Molecular Vision 2014; 20:1491-1517
<http://www.molvis.org/molvis/v20/1491>
Received 05 May 2014 |
Accepted 02 November 2014 |
Published 04 November 2014
Thanh V. Hoang,1 Praveen Kumar Raj Kumar,1 Sreeskandarajan Sutharzan,1 Panagiotis A. Tsonis,2 Chun Liang,1 Michael L. Robinson1
The first two authors contributed equally to the work
1Department of Biology, Miami University, Oxford, OH; 2Department of Biology and Center for Tissue Regeneration and Engineering, University of Dayton, Dayton, OH
Correspondence to: Michael L. Robinson, Department of Biology, Miami University, Oxford, Ohio 45056; Phone: (513) 529-2353; FAX: (513) 529-6900; email: Robinsm5@miamioh.edu
Purpose: The ocular lens contains only two cell types: epithelial cells and fiber cells. The epithelial cells lining the anterior hemisphere have the capacity to continuously proliferate and differentiate into lens fiber cells that make up the large proportion of the lens mass. To understand the transcriptional changes that take place during the differentiation process, high-throughput RNA-Seq of newborn mouse lens epithelial cells and lens fiber cells was conducted to comprehensively compare the transcriptomes of these two cell types.
Methods: RNA from three biologic replicate samples of epithelial and fiber cells from newborn FVB/N mouse lenses was isolated and sequenced to yield more than 24 million reads per sample. Sequence reads that passed quality filtering were mapped to the reference genome using Genomic Short-read Nucleotide Alignment Program (GSNAP). Transcript abundance and differential gene expression were estimated using the Cufflinks and DESeq packages, respectively. Gene Ontology enrichment was analyzed using GOseq. RNA-Seq results were compared with previously published microarray data. The differential expression of several biologically important genes was confirmed using reverse transcription (RT)-quantitative PCR (qPCR).
Results: Here, we present the first application of RNA-Seq to understand the transcriptional changes underlying the differentiation of epithelial cells into fiber cells in the newborn mouse lens. In total, 6,022 protein-coding genes exhibited differential expression between lens epithelial cells and lens fiber cells. To our knowledge, this is the first study identifying the expression of 254 long intergenic non-coding RNAs (lincRNAs) in the lens, of which 86 lincRNAs displayed differential expression between the two cell types. We found that RNA-Seq identified more differentially expressed genes and correlated with RT-qPCR quantification better than previously published microarray data. Gene Ontology analysis showed that genes upregulated in the epithelial cells were enriched for extracellular matrix production, cell division, migration, protein kinase activity, growth factor binding, and calcium ion binding. Genes upregulated in the fiber cells were enriched for proteosome complexes, unfolded protein responses, phosphatase activity, and ubiquitin binding. Differentially expressed genes involved in several important signaling pathways, lens structural components, organelle loss, and denucleation were also highlighted to provide insights into lens development and lens fiber differentiation.
Conclusions: RNA-Seq analysis provided a comprehensive view of the relative abundance and differential expression of protein-coding and non-coding transcripts from lens epithelial cells and lens fiber cells. This information provides a valuable resource for studying lens development, nuclear degradation, and organelle loss during fiber differentiation, and associated diseases.
The ocular lens is an excellent model for studying development, physiology, and disease [1]. The mammalian lens is made up of only two cell types: epithelial cells, which comprise a monolayer of cells that line the anterior hemisphere of the lens, and fiber cells, which make up the remainder of the lens mass. The primary lens fiber cells result from differentiation of the cells in the posterior half of the lens vesicle while secondary fiber cells differentiate from lens epithelial cells displaced toward the equator by lens epithelial cell proliferation. During differentiation, lens epithelial cells undergo cell cycle arrest, elongate, and begin expressing genes characteristic of lens fiber cells [2]. Eventually, the differentiating fiber cells lose their nuclei and other intracellular organelles, such that the most mature lens fiber cells in the center of the lens exist in an organelle-free zone [3]. Lens growth, through epithelial cell proliferation and secondary fiber cell differentiation, occurs throughout the vertebrate lifespan.
Lens fiber cell differentiation is a highly coordinated process involving specific changes in gene expression between two different cell types. For example, several genes, including Pax6, Foxe3, and E-cadherin (Cdh1), are highly expressed in lens epithelial cells but downregulated in lens fiber cells [4-6]. In contrast, many other genes, including Aquaporin0 (Aqp0 or Mip), β- and γ-crystallins, CP49 (Bfsp2), and filensin (Bfsp1), are expressed at low levels in lens epithelial cells but are strongly upregulated during lens fiber cell differentiation [7-9]. However, a comprehensive understanding of gene expression changes during lens fiber differentiation remains incomplete. In particular, the expression and role of long non-coding RNAs during lens development largely await discovery.
Increasing evidence suggests an important role for long intergenic non-coding RNAs (lincRNAs) in regulating gene transcription and protein synthesis [10,11]. LincRNAs are non-coding transcripts more than 200 nucleotides long that are intergenically transcribed. LincRNAs can regulate gene expression via cis and trans mechanisms. LincRNAs potentially function in many different ways, including cotranscriptional regulation, bridging proteins to chromatin, and scaffolding of nuclear and cytoplasmic complexes [11]. Little information currently exists about the specific expression pattern or function of lincRNAs during lens development.
Microarrays provide a comprehensive approach for gene-expression studies [12]. Several previous investigations applied microarray technology to the lens, where transcriptional profiling was typically restricted to whole lenses [13,14], fiber cells [15], or lens epithelial explants [16-18]. However, microarrays have several limitations, including probe cross-hybridization, the selection of specific probes, and low detection thresholds that may reduce the ability to accurately estimate low-level transcripts. Additionally, novel transcripts and splice isoforms of annotated genes are often missed because microarray design often limits information to previously identified transcripts.
The application of next-generation sequencing (NGS) technology creates enormous potential to increase the sensitivity and resolution of genomic and comprehensive transcriptome analyses without many of the limitations of microarrays [19]. Visualization of mapped sequence reads spanning splice junctions can also reveal novel isoforms of previously annotated genes, which was not possible with microarrays [20,21]. Deep sequencing of RNA with NGS (RNA-Seq) permits comprehensive evaluation and quantification of all types of RNA molecules expressed in a cell or tissue, including mRNA and non-coding RNAs [22]. RNA-Seq can detect rare, novel, and alternatively spliced transcript isoforms. RNA-Seq also identifies rare mutations and RNA editing, and can elucidate non-model organism transcriptomes [23-25]. The declining costs and increased availability of RNA-Seq are fueling the emergence of this technology as the most powerful current method for comprehensive transcriptome profiling.
To date, two studies have used RNA-Seq in mouse lenses. However, these studies were restricted to mRNA expression profiling in whole lenses [26] or in lens epithelial explants [27]. Here, we present the first application of RNA-Seq to comprehensively investigate the transcriptomes of epithelial cells and fiber cells of newborn FVB/N mouse lenses. RNA-Seq results were then compared with those obtained with previously published microarray methods [28]. Furthermore, RT-qPCR confirmed the expression of several biologically important genes and provided a benchmark with which to compare the RNA-Seq and microarray analysis.
We found that RNA-Seq provided a more complete and accurate approach for comprehensive, comparative analysis of the lens transcriptome than previously described methods. RNA-Seq analysis of lens epithelial cells and lens fiber cells discovered 1,368 more differentially expressed transcripts than found in a similar microarray analysis of P13 C57BL/6J mouse lenses [28]. Surprisingly, only 22% of the genes (1,009) differentially expressed in this microarray study were differentially expressed in the current RNA-Seq analysis. The comprehensive and quantitatively continuous nature of RNA-Seq will facilitate better understanding of lens fiber differentiation and organelle destruction inherent in lens development.
Newborn, inbred FVB/N mice, or P13 C57BL/6J mice, were euthanized with CO2 inhalation before the lenses were removed. All procedures were approved by the Miami University Institutional Animal Care and Use Committee and complied with the ARVO Statement for Use of Animals in Research and consistent with those published by the Institute for Laboratory Animal Research (Guide for the Care and Use of Laboratory Animals).
Lenses were removed from the eye and carefully isolated by manual dissection from the blood vessels, retina, and cornea. The isolated lenses were dissected into two fractions. The epithelial fraction contained the lens capsules with adherent cells that include the entire central and equatorial epithelium and cells in the transitional zone. This fraction also likely contained some early fiber cells firmly attached to the capsule, although obvious elongated fiber cells in this fraction were manually removed. Although every attempt was made to physically remove the tunica vasculosa lentis, the lens epithelial fraction most likely contained cells from this vasculature. The fiber cell fraction included the bulk of the lens mass that was non-adherent to the lens capsule. Epithelial and fiber cell fractions were each pooled into three biologic replicate samples for each cell type. Each sample contained tissue from eight lenses from four newborn mice. Total RNA, including mRNA and lincRNA, was isolated using the mirVana isolation kit (Ambion, Life Technologies, Grand Island, NY). Total RNA samples with the RNA integrity number (RIN, Agilent 2100 Bioanalyzer) ≥8.0 were used to prepare a library of template molecules suitable for subsequent sequencing on an Illumina (St. Louis, MO) HiSeq platform. Briefly, polyadenylated RNA was purified from the total RNA samples using Oligo dT conjugated magnetic beads and prepared for single-end sequencing according to the Illumina TruSeq RNA Sample Preparation Kit v3. The library was sequenced for 50 cycles using the TruSeq SBS kit on an Illumina HiSeq 2000 system at the Genomics and Sequencing Core Laboratory at the University of Cincinnati. Approximately 29 million sequences that were 51 bp long per sample were generated. Bases were called and translated to generate FASTQ sequence files. All raw sequencing data were deposited at the NCBI’s Sequence Read Archive (SRA) and are accessible under the SRA accession number SRP040480.
Data preprocessing-- The six Illumina HiSeq data sets, three pooled biologic replicates from each tissue (designated E1, E2, and E3 for the epithelial samples and F1, F2, and F3 for the fiber cell samples), were individually assessed for quality using prinseq [29] and FastQC. Raw Illumina sequence reads were trimmed for low-quality reads (phred <28), ambiguous bases (N), sequencing adapters, primers, and poly(A)/(T) tails.
Reference mapping-- The trimmed reads were mapped to the Mus musculus reference genome (build GRC38.72, ENSEMBL/MGI annotations from C57BL/6J [30]) using the Genomic Short-read Nucleotide Alignment Program (GSNAP version gmap-gsnap-2013–09–30.v2) [31]. The GSNAP feature of single-nucleotide polymorphism (SNP) tolerant mapping was used with the available SNP information downloaded from ENSEMBL as Variant call format. The SNPs and the known splice sites were obtained from ENSEMBL genome annotation. Sequencing reads were mapped to the reference genome with two mismatches allowed. Unique and perfect mappings were identified using the Sequence Alignment Format (SAM) flag NH:i:1 and NM:i:0, respectively. For high confidence in calling gene expression, only uniquely and perfectly mapped reads (with the inclusion of SNPs) were used.
Estimation of gene expression and identification of differentially expressed genes-- Transcript assembly and abundance estimation were performed using Cufflinks 2.1.1 [32]. Gene expression levels were expressed as reads per kilobase per million (RPKM) mapped reads. The minimum threshold of 0.5 RPKM was used to define gene expression. The DESeq package was used to study differential gene expression [33], where the input count table was obtained using HTseq. The significance of differentially expressed genes was identified with an adjusted p value (in DESeq, the adjusted p value considers multiple testing using the Benjamini-Hochberg method [33]) less than 0.05 (at 95% confidence) and with a fold change greater than 1.5. Protein-coding genes and non-coding genes were then sorted using Perl scripts. A Venn diagram was created using VENNY.
The GOseq package was used to find Gene Ontology (GO) categories, including biochemical process, cellular components, and molecular function, to identify particular functions enriched among differentially expressed genes. GOseq has the ability to account for gene length and read count biases [34].
To validate the RNA-Seq data, the expression level of selected genes was analyzed with RT-qPCR. Genes that have low and high expression levels were chosen. The same epithelial and fiber cell RNA samples used for RNA-Seq were reverse transcribed into cDNA, using random primers and Superscript II reverse transcriptase (Invitrogen), according to the manufacturer’s instructions. After incubation at 42 °C for 50 min, the reaction was stopped by heating to 75 °C for 15 min. The qPCR assays were performed on the cDNA using GoTaq Green Master Mix (Promega, Madison, WI) following the manufacturer’s instructions and read using a CFX connect instrument (Bio-Rad, Hercules, CA). Intron-spanning primers were designed to specifically quantify targeted mRNA transcripts (Appendix 1). Although we did not perform an RNA-Seq analysis of C57BL/6J RNA, we compared our RNA-Seq results to a previous microarray study of lens epithelial cells and lens fiber cells from P13 C57BL/6J mice. Therefore, RNA from P13 C57BL/6J lens epithelial cell and lens fiber cell fractions was analyzed with RT-qPCR to allow for more direct comparison with a previously performed differential gene expression analysis with microarray [28]. Each biologic sample was analyzed in triplicate with qPCR. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) expression was used as an endogenous control. The cycling conditions consisted of 1 cycle at 95 °C for 100 s for denaturation, followed by 40 three-step cycles for amplification (each cycle consisted of 95 °C incubation for 20 s, an appropriate annealing temperature for 10 s, and product elongation at 70 °C incubation for 20 s). The melting curve cycle was generated after PCR amplification. The reaction specificity was monitored by determining the product melting temperature and by checking for the presence of a single DNA band on agarose gels from the RT-qPCR products. To calculate the amplification efficiencies (E) of the qPCR primers, standard curves were obtained by performing qPCR reactions on serially diluted samples; the PCR efficiencies (E) were calculated based on the slopes of the standard curves (E=10 (−1/slope) −1). The quantification cycle (Cq, also commonly called the Ct) was obtained, and the ΔCq value was calculated with Cq (gene) − Cq (GAPDH). For fold-change (FC) expression, ΔΔCq was first calculated by subtracting the mean of the ΔCq values of the fiber samples from the meanΔCq values of epithelial samples and then converted to 2 (−ΔΔCq). The data are expressed as mean ± standard error of the mean (SEM).
Previously, gene expression profiling in the lens used expressed sequence tag analysis, microarrays, and RNA-Seq analysis; however, these studies focused on examining the whole lens [13,14,26,35-38], lens fiber cells [15], or lens epithelial explants [16-18,27]. Unlike prior studies, the current work examined changes in the expression of protein-coding genes and non-coding genes from epithelial cells and fiber cells in vivo. To our knowledge, these studies represent the first application of RNA-Seq to identify the transcriptional changes underlying epithelial cell differentiation into fiber cells during mouse lens development.
The lens epithelial and fiber samples in this RNA-Seq analysis consisted of cells physically separated based on adherence to the lens capsule. This method of separation raises important considerations for interpreting gene expression data. Lens epithelial cells at the anterior surface of the lens can be grouped into central and peripheral regions [39]. Epithelial cells in the central region are considered mitotically quiescent, while epithelial cells in the peripheral region can be divided into the germinative zone and the transitional zone. The germinative zone epithelium is mitotically active, while the transitional zone epithelial cells closest to the equator undergo cell cycle arrest and align into meridional rows before differentiating into fiber cells [40,41]. These earliest differentiating transitional zone cells remain adherent to the lens capsule and represent a fraction of the transcripts in the “lens epithelial” fraction in the current RNA-Seq analysis. This, coupled with the possible adherence of a few early differentiating fiber cells, may explain the appearance of “fiber cell-specific” transcripts such as those for Aqp0 and Gja3 in the lens epithelial RNA-Seq data (see below). Likewise, the lens fiber fraction is heterogeneous, containing young, cortical fiber cells in the outer layer and fully mature, organelle-free fiber cells in the center of the lens [42]. Therefore, despite dividing the lens transcripts into epithelial and fiber cell fractions, the result of the RNA-Seq analysis reflected the transcriptional changes in somewhat heterogenous cell populations resulting from the isolation method.
RNA-Seq obtained approximately 25 to 33 million raw sequence reads for each of the six cDNA libraries (three from capsular adherent lens epithelial cells and three from fiber cells) obtained from pooled newborn FVB/N mouse lenses. Trimmed reads that passed the quality filter were mapped to the mouse C57BL/6J reference genome (GRC38.72) using the software GSNAP with the SNP tolerant mapping with SNPs known to differ between C57BL/6J and FVB/N mice. On average, 90% of the total reads were successfully mapped to the reference genome (Table 1). Using Cufflinks for estimating transcript abundance, with an expression-level threshold of RPKM ≥0.5, the lens epithelial fraction contained transcripts from 13,732 genes and the fiber cells expressed 10,850 genes for a total of 14,060 genes expressed in the lens. The epithelial and fiber cell fractions coexpressed 10,522 (74.8%) of these genes. Using DESeq (p value ≤0.05 and fold change ≥1.5, Figure 1, Appendix 2) for examining differential gene expression, the lens epithelial cells and the lens fiber cells differentially expressed 6,022 protein-coding genes. Of these genes, the lens epithelial cells upregulated 3,233 genes, and the lens fiber cells upregulated 2,789 genes. Among 1,746 annotated mouse lincRNA genes, the lens expressed 254 lincRNAs. Of these, 86 lincRNA genes were differentially expressed between the two lens cell types, with epithelial cells upregulating 32 genes and fiber cells upregulating 54 genes (Appendix 3).
Genes with the highest expression level in the epithelial cells (ranked by RPKM values) included several crystallins, clusterin (Clu), collagens (Col4a1 and Col4a2), osteonectin (Sparc), heat shock protein 90-beta (Hsp90ab1), glucose transporter-1 (Slc2a1), and heparan sulfate proteoglycan-2 (Hspg2; Appendix 4). Furthermore, lens epithelial cells highly expressed many mitochondrial-encoded genes, including NADH dehydrogenases (mt-Nd1, mt-Nd2, mt-Nd4, and mt-Nd5), cytochrome b (mt-Cytb), tRNA leucine 1 (mt-Tl1), and leucyl-tRNA synthetase-2 (Lars2). Interestingly, although not differentially expressed, the epithelial and fiber cells expressed abundant transcripts for the novel lincRNA, RP23–81C12.3.
The fiber cell RNA libraries contained numerous transcripts for genes encoding several different crystallins, as well as several structural proteins (Lim2, Bfsp1, Sparc, and Vim), CD24a antigen (Cd24a), Tudor domain containing 7 (Tdrd7), connexins (Gja3 and Gja8), and cyclin-dependent kinase inhibitor 1C (Cdkn1c, also known as p57KIP2; Appendix 5). The fiber cells also expressed several lincRNAs, including RP23–81C12.3, metastasis-associated lung adenocarcinoma transcript 1 (Malat1), growth arrest specific 5 (Gas5), and maternally expressed gene 3 (Meg3).
Genes differentially expressed (DEGs) in the epithelial cell and fiber cell fractions were first filtered based on whether they were upregulated in the epithelial cells or fiber cells. Genes were then ranked by the lowest adjusted p value rather than the greatest fold change to avoid large fold-change differences that failed to achieve sufficient significance as a result of low expression levels. The most significantly differentially expressed genes, ranked by adjusted p value, in the epithelial cells included arylsulfatase family member I (Arsi), immunoglobulin superfamily containing leucine-rich repeat protein (Islr), sulfatase-1 (Sulf1), folate receptor-1 (Folr1), B-cell translocation gene-1 (Btg1), nepronectin (Npnt), vascular endothelial growth factor receptor-1 (Flt1), and cyclin dependent kinase-1 (Cdk1; Figure 2A). The most significantly DEGs in the fiber cells tended to be those that encode lens structural proteins and membrane proteins, including several different crystallins, filensin (Bfsp1), gap junction protein epsilon-1 (Gje1), aquaporin0 (Aqp0 also known as Mip), tropomodulin1 (Tmod1), CD24 molecule (Cd24a), solute carrier family 24 (Slc24a4), and lens fiber membrane intrinsic protein (Lim2; Figure 2B). The structure and transparency of the lens require a high concentration (about 350 mg/ml) of crystallins, which represent the largest class of lens structural proteins. Mutations in several different crystallin genes result in cataract in mice and humans [43]. Len cell membranes contain high concentrations of the lens-specific water channel, Aqp0, as well as gap junctions and ion channels, which cooperate to allow the lens to transport nutrients and waste products despite the lack of vasculature in the mature lens [44].
Lens fiber cells upregulated the expression of Dnase2b, Tdrd7, Hsf4, and Tmod1, consistent with the importance of these genes in lens development and fiber cell differentiation. DNase2b activity is required for nuclear DNA degradation in the cortical fiber cells, and loss of Dnase2b leads to nuclear cataracts in mice [45]. Tdrd7, a member of a family of proteins that form RNA granules, has been shown to regulate gene expression in the lens by controlling the fate of mRNA. Loss-of-function mutations in TDRD7 in humans and Tdrd7 nullizygosity in mice cause cataracts as well as glaucoma [46]. Hsf4 mutations are associated with autosomal dominant lamellar and Marner cataract [47]. Hsf4 null mice have cataracts with abnormal fiber cells [48], due to downregulation of gamma-crystallin, Bfsp1, and Dnase2b expression [49,50]. Previous studies have shown that Tmod1 is required for coordinating fiber cell shapes and geometry during lens fiber elongation and maturation by stabilizing the spectrin-actin network in the lens fiber cell membrane. Deletion of Tmod1 leads to fiber cell disorder and impairs lens transparency [51,52].
Although Pla2g7, Cd24a, and Caprin2 were highly expressed in the fiber cells, previous studies demonstrated that normal lens development occurs in the absence of these genes. Pla2g7 is a secreted enzyme that catalyzes the degradation of platelet-activating factor into inactive products [53] and has been shown to be involved in arthrosclerosis and coronary diseases [54,55]. Pla2g7 null mice grow to adulthood without a reported eye phenotype [56]. Cd24a, a cell-surface glycoprotein, is highly expressed in fiber cells in our data, consistent with a previous study [57]. However, no eye phenotype was reported for Cd24a-knockout mice [58]. Caprin2 was highly expressed in the lens primary fiber cells and the secondary fiber cells in mouse and chick lens, and that expression was induced by fibroblast growth factor (FGF) signaling [59].
RNA-Seq data also demonstrated strong expression of Clu, Hsp90ab1, Fabp5, Thsd4, and Ctnna2 in the fiber cells, suggesting that these genes may be biologically important for fiber differentiation or function. Clu and Hsp90ab1encode chaperone proteins and may play an important role in preventing protein misfolding and maintaining lens transparency in the fiber cells. Fabp5 is involved in fatty acid uptake and metabolism [60], but no lens phenotype was reported in mice with targeted mutations in Fabp5 [61]. Pgam2 is an important enzyme in glycolytic pathway [62], but its expression and function in the lens has not been previously examined. Thsd4, also known as ADAMTSL4, is a secreted glycoprotein and is found in differentiating fiber cells. Thsd4 interacts with fibrillin-1 and promotes fibril formation [63]. Mutations in Thsd4 cause ectopia lentis, which is malposition of the lens from its normal position [64]. However, the function of Thsd4 in the lens fiber cells remains unknown. The Ctnna2 gene encoding cytoplasmic alpha-N-catenin is strongly expressed in the lens fibers [65]. It functions by interacting with actin cytoskeleton with beta-catenin at cell junctions [66]. It is not clear whether the effect of the loss of αN-catenin on the lens was examined in mice [67].
The lens also upregulates the expression of Lctl during differentiation. Lctl encodes a novel Klotho-lactase-phlorizin hydrolase-related protein [68]. Klotho and βKlotho act as coreceptors for the FGF15/19 subfamily of endocrine FGFs. Although the function of Lctl in vivo remains unknown, expression of Lctl potentiated FGF19-induced ERK1/2 phosphorylation in HEK293 cells [69].
Nakahara et al. previously compared gene expression between mouse lens epithelial cells and lens fiber cells using a microarray [28]. This microarray analysis examined P13 lenses from the C57BL/6J mouse inbred strain and, like our RNA-Seq analysis, used a ±1.5 fold-change cutoff for differential expression. Lens epithelial and lens fiber cell fractions in the microarray analysis were isolated identically as described for the current RNA-Seq analysis. However, although our RNA-Seq analysis used newborn FVB/N lens RNA with three biologic replicates per fraction the previously published microarray lacked biologic replicates. The lack of biologic replicates prevented the assessment of p values for the microarray analysis.
When comparing the results of the RNA-Seq data with that obtained in this microarray study, the microarray detected 4,654 DEGs, much lower than that detected by RNA-Seq (6,022 DEGs). Of these DEGs discovered with the microarray, the lens epithelium upregulated 1,812 genes, and the fiber cells upregulated 2,842 genes. In contrast to RNA-Seq, microarray analysis requires previous knowledge of gene transcripts to construct hybridization probes, restricting analyses to transcripts built on the array. Therefore, the specific microarray used prevented the analysis of several genes (for example, Cdkn1b) due to a lack of specific probes for hybridization. In total, 1,009 DEGs were commonly identified with RNA-Seq and microarray analysis (Figure 3). This figure represents only 22% of the differentially expressed genes identified previously by microarray.
To validate the RNA-Seq analysis, transcripts from 19 genes that have a wide range of expression (from no change to different levels of differential expression) were chosen for the RT–qPCR analysis. These genes included those involved in cell cycle regulation, receptor tyrosine kinase signaling, DNA methylation, and cytosolic enzymes. RT–qPCR facilitated the analysis of gene expression levels, from P0 FVB/N and P13 C57BL/6J lenses, to compare the RNA-Seq data obtained from P0 FVB/N RNA with the previous microarray data obtained from P13 C57BL/6J RNA. As in the RNA-Seq and microarray, we maintained a 1.5 FC as the threshold for differential gene expression with negative FC values indicating lower expression in the fiber cell fraction compared with the epithelial cell fraction. For genes that exhibit a positive FC, our RNA-Seq data correlated well with the RT-qPCR results from P0 FVB/N samples (Figure 4A), except Mapk3 and Atg3, which were not differentially expressed according to RT-qPCR (FC=1.11 and 1.36, respectively). Meanwhile, the correlation of the microarray data with the RT-qPCR results for P13 C57BL/6J samples was weaker, notably in the cases of Fgfr3, Evt5, and Prox1 expression. For genes more highly expressed in the epithelium (Figure 4B) and genes with large differential expression (Figure 4C), the RNA-Seq data also exhibited better agreement with the RT-qPCR results than the microarray analysis. The lack of a specific microarray probe for Cdkn1b prevented microarray data analysis for this gene. Overall, the RNA-Seq data and the microarray data trended in the same direction, regarding differential gene expression, as the RT-qPCR analysis. However, the RNA-Seq data correlated more closely with the RT-qPCR data than with the microarray data.
The use of different inbred mouse strains and different aged lenses (P0 FVB/N in the RNA-Seq and P13 C57BL/6J in the microarray) likely contributed to some specific differences in the conclusions between these two analyses (for example, see Figure 4A for Atg3, and Lgmn) where the RT-qPCR of P13 C57BL/6 RNA correlated most closely with the microarray. However, RT-qPCR validation using P0 FVB/N and P13 C57BL/6 RNA confirmed that RNA-Seq often more closely predicted the gene expression level than the microarray (for example, see Figure 4A for Mapk1, Etv5, Spry2, and Prox1).
Genes upregulated in epithelial cells and genes upregulated in fiber cells were analyzed for GO enrichment using the GOseq software package, which corrects for gene length and read count biases during RNA-Seq [34]. A comparison of the top GO terms (based on p value) for cellular component, molecular function, and biologic process for genes more highly expressed in epithelial cells or fiber cells is shown in Figure 5. For cellular component terms, genes upregulated in epithelial cells are enriched for components of the extracellular matrix, plasma membrane, cell surface, chromosomes, and intracellular organelles, whereas genes upregulated in fiber cells show enrichment for proteosome complexes, the cytoskeleton, and respiratory chain (Figure 5A).
For molecular function terms, genes upregulated in epithelial cells were enriched for transcripts that encode protein kinase and protein dimerization activities. Transcripts encoding binding activities, include calcium binding, ion binding, growth factor binding, protein kinase binding, receptor binding, integrin binding, and glycosaminoglycan binding, were abundantly expressed in the lens epithelial cell fraction. Not surprisingly, the transcripts upregulated in fiber cells were enriched for structural constituents of the eye lens, cytoskeletal protein binding, hydrolase activity, unfolded protein binding, and ubiquitin binding (Figure 5B). Biologic process terms enriched in genes upregulated in epithelial cells include development, cell cycle, cell division, and cell migration. This is consistent with the fact that epithelial cells undergo active proliferation and migration before fiber cell differentiation (Figure 5C). In contrast, genes upregulated in fiber cells were enriched for transcripts encoding transport, localization, and metabolic processes.
In addition to heterogeneity within the lens cell populations, potential non-lens transcript contamination should be considered. Although the lenses were isolated from the eyes and carefully dissected from the blood vessels, retina, and cornea, the biologic process term enrichment of the RNA-Seq data showed that vascular development was enriched in epithelial-upregulated genes (Figure 5C). When looking at the expression of endothelial markers in the RNA-Seq data, Pecam-1 and Flt1 (also known as Vegfr1) were highly expressed in epithelial fractions, indicating the possibility of contaminating blood vessels. However, we cannot rule out that the gene expression in epithelial fractions is indeed bona fide. Opn1sw, a retinal marker, and Rpe65, an RPE cell marker, are not expressed in the lens [70]. Consistent with this study, we did not detect the expression of those genes in either lens epithelial or fiber cells, excluding the possibility of the contaminating retina and RPE.
The RNA-Seq data were analyzed for differentially expressed genes involved in selective signaling pathways that play important roles during lens development and fiber cell differentiation. Since others have discussed transcription factor expression in the developing lens [71], we will omit specific discussion of transcription factors.
Among more than 40 receptor tyrosine kinases (RTKs) expressed in the lens, FGF receptors (Fgfrs) showed the highest gene expression levels in epithelial cells and fiber cells (Table 2). The abundant expression of Fgfrs is consistent with the known essential role for Fgfr signaling in lens development [72]. Of the four genes encoding Fgfrs capable of active tyrosine phosphorylation (Fgfr1–4), Fgfr4 expression was barely detected in the lens RNA-Seq data. Fgfr3 expression (the most highly expressed of the Fgfrs in the lens) was upregulated in fiber cells as confirmed with our RT-qPCR (Figure 4A) and was shown in previous studies [73,74]. There was no significant difference in Fgfr1 and Fgfr2 expression levels between the epithelial and fiber cell fractions. Eph receptor A2 and the Met proto-oncogene were also upregulated in fiber cells. Most of the other tyrosine kinase receptors were expressed more abundantly in the epithelial cells than in the fiber cells. These include discoidin domain receptor family member 1 (Ddr1), platelet derived growth factor receptors (Pdgfra and Pdgfrb), AXL receptor tyrosine kinase (Axl), Eph receptors (Ephb2, Ephb4, and Ephb6), and kinase insert domain protein receptor (Kdr).
Notch signaling has been shown to regulate cell growth and differentiation in the mammalian lens [75,76]. We found that the majority of genes involved in Notch signaling pathways, including the Notch receptors (Notch1, Notch2, Notch3 and Notch4), ligands (Dll1, Dll4, and Jag2), and effector genes (Hes1 and Hes5) of this pathway, were expressed more abundantly in the epithelial cells, indicating that Notch signaling is more active in epithelial cells than in fiber cells (Table 3), consistent with previous functional studies [77,78].
Wnt signaling plays an important role in regulating eye development [79,80], including lens epithelial and fiber cell differentiation [81-83]. Our data showed that among ten Wnt receptors, five Wnt receptors (Fzd1, Fzd2, Fzd4, Fzd7, and Fzd 8) were significantly upregulated in the epithelial cells, while only Fzd3 was slightly upregulated in the fiber cells (Table 4). Fzd6 expression was not significantly different between the two cell types. Lens epithelial cells preferentially expressed the Wnt ligands, Wnt5a and Wnt11, more abundantly in epithelial cells, while fiber cells upregulated the expression of Wnt7a and Wnt7b. The lens epithelial cells and the fiber cells expressed the Wnt signaling effectors, Dvl1, Dvl2, and Dvl3, but only Dvl2 exhibited differential expression with 2.47-fold more transcripts in the lens epithelium (Table 4).
Bone morphogenetic protein (BMP) signals also play an essential role in lens development. In fact, lens induction requires BMP signaling [84,85], and BMP signaling interacts with FGF signaling to regulate cell cycle exit in primary lens fiber differentiation [86-88] and secondary lens fiber differentiation [89]. Of 21 transforming growth factor-β (Tgfβ) superfamily receptors, the fiber cells expressed bone morphogenetic protein receptor 1A (Bmpr1a) most abundantly and upregulated Bmpr1b (Appendix 6). The epithelial cells and the fiber cells expressed Bmpr2 as well as activin A receptor IB and IIB (Acrv2b and Acrv1b). Most other Tgfβ receptors, including transforming growth factor-β receptor I (Tgfbr1), transforming growth factor-β receptor II (Tgfbr2), neural cell adhesion molecule 1 (Ncam1), activin A receptor type 1 (Acvr1), and endogen (Eng), were expressed more abundantly in epithelial cells than in fiber cells.
RNA-Seq of the newborn lenses detected a limited number of TGFβ superfamily ligand transcripts, and almost all of the differentially regulated transcripts were preferentially expressed in the epithelial cells (Appendix 6). Ligands showing the highest expression level are bone morphogenetic protein-1 (Bmp1), transforming growth factor-β2 (Tgfb2), bone morphogenetic protein-7 (Bmp7), and inhibin α (Inha). Other ligands, including Bmp3, Bmp5, Bmp6, Bmp8a, Bmp8b, Bmp10, and Bmp15, were not detected in either cell type. Overall, the upregulated expression of the Tgfβ superfamily receptors and ligands in the lens epithelial cells may indicate that Tgfβ signaling is more active in the epithelial cells than in the fiber cells.
The final stages of lens fiber cell differentiation require nuclear degradation to promote lens transparency. The destruction of nuclear DNA requires DNases (DNases). Lens epithelial cells and lens fiber cells express transcripts for DnaseI, Dnase2a, and Dnase2b. Transcripts for Dnase2b outnumber the other two DNase transcripts in the lens. In particular, fiber cells upregulate DNase2b transcripts 100 fold relative to the lens epithelium (Table 5). The induction of DNase2b transcripts in fiber cells is consistent with the known importance of this enzyme for DNA destruction during fiber cell denucleation [28,45].
The failure to repair DNA double-stranded breaks (DSBs) can lead to programmed cell death [90]. Previous studies showed that this type of DNA damage was generated during normal fiber cell denucleation [91,92]. The possibility exists that reduced DNA repair activity plays a mechanistic role during lens fiber cell denucleation. Consistent with this view, transcripts of many key components of the DNA repair pathway were downregulated in the fiber cells (Table 5).
The induction of proteolytic pathways that trigger organelle loss during lens fiber cell differentiation bears some resemblance to the initiation of programmed cell death (apoptosis) [93,94]. The p53 family members (p53, p63, and p73) are tumor suppressor genes and regulate apoptosis and/or differentiation [95]. RNA-Seq analysis found p53 transcripts were more abundant in the lens than p63 or p73 transcripts. The number of p53 transcripts in the epithelial cells also exceeds the number in fiber cells (Appendix 7), consistent with previous evidence showing that p53 is expressed in epithelial cells and the lens fiber bow region [96]. The RNA-Seq data also revealed high expression of genes including the p53 regulator Mdm2, as well as several members of the Bcl family of apoptosis regulators in epithelial and fiber cells.
The lens also expresses genes involved in proteolytic pathways, including caspases, caspase inhibitors, autophagy-related genes, capthepsins, and calpains (Appendix 7). Among all caspases, the lens most highly expresses caspase-7, where fiber cells possessed increased caspase 7 transcripts relative to epithelial cells. Meanwhile, transcripts for caspase-2 and -9 were more abundant in the epithelial cells. Fiber cells also upregulated caspase inhibitors (IAPs), Birc7 and Birc2, whereas the lens epithelium upregulated Xiap and Birc5 expression. Epithelial cells strongly expressed cathepsin-L and cathepsin-F, while fiber cells preferentially expressed cathepsin-D and cathepsin-Z. Fiber cells also strongly upregulated transcripts of calpain-1, calpain-2, and calpain-3, suggesting that calpain-mediated protein degradation may play an important role in organelle loss during fiber cell differentiation.
Aquaporins-- Aquaporins are membrane water transport proteins that are found in many cell types [97]. Aquaporin-0, also known as the major intrinsic protein (Mip), is the most abundant protein in lens fiber cell membranes. The RNA-Seq data showed that transcripts for this lens-specific aquaporin outnumbered all other aquaporins in the lens with fiber cells expressing 60-fold more Aqp0 transcripts than epithelial cells (Appendix 8). The expression of aquaporin-5 (Apq5) and aquaporin-11 (Aqp11) was low in both cell types, while the lens epithelium preferentially expressed transcripts for aquaporin 1 (Aqp1), aquaporin 4 (Aqp4), and aquaporin-8 (Aqp8). Other aquaporin genes, including aquaporin-7 (Aqp7), aquaporin-2 (Aqp2), aquaporin-6 (Aqp6), aquaporin-9 (Aqp9), and aquaporin-12 (Aqp12), were minimally expressed or absent in the newborn mouse lens. Numerous mutations in Aqp0 give rise to cataracts in humans and mice [98], and although Aqp1 mutant mice display increased sensitivity to stress induced cataract [99], humans with Aqp1 mutations have normal lenses. Aqp5 null mice also fail to develop cataracts [100].
Gap junction proteins-- Gap junction-mediated coupling in lens cells plays a particularly important role in lens homeostasis. In fact, lens transparency depends on the activity of sodium pumps, gap junctions, and aquaporins to maintain fluid transport in the lens [101]. Most gap junction genes exhibited higher expression in the fiber cells. The most abundantly expressed gap junction transcripts in the fiber cells included gap junction protein-α3 (Gja3), also known as connexin 46, gap junction protein-α8 (Gja8), also known as connexin 50, and gap junction protein-ε1 (Gje1), also known as connexin 23(Appendix 8). Transcripts for gap junction protein-α1 (Gja1), also known as connexin 43, gap junction protein gamma-1 (Gjc1), also known as connexin 45, and gap junction protein-α4 (Gja4), also known as connexin 37, were upregulated in the epithelial cells. Mutation of connexins, including Gje1, Gja3, and Gja8, causes microphthalmia and cataracts [102,103]. Lim2 is an integral protein found in cell gap junctions. A missense mutation of LIM2 in the human lens [104] and targeted disruption of Lim2 in the mouse lens [105] also cause cataracts.
Intermediate filaments-- Intermediate filaments (IFs) are a key component of the cytoskeleton of all vertebrate cells. The lens exclusively expresses a subset of specialized intermediate filament proteins known as bead filament proteins. The beaded filament proteins in the lens are commonly called filensin (Bfsp1) and phakinin (Bfsp2) [44]. The RNA-Seq data showed the highest expression of transcripts for Bfsp1, Bfsp2, vimentin (Vim), and synemin (Synm) in the fiber cells (Appendix 8). Previous studies reported that mouse FVB/N and 129 inbred strains harbor a natural 6-kb deletion mutation in Bfsp2 gene. This deletion causes exon 1 to inappropriately splice to exon 3. This incorrect mRNA splicing changes the reading frame of the Bfsp2 transcript. As a result, a stop codon at position 2 of exon 3 in the Bfsp2 transcript prevents the translation of functional BFSP2 protein in the lens [106,107]. Consistent with these studies, we did not find any reads mapping to the exon 2 of Bfsp2 gene in our FVB/N lens samples (data not shown). Other IFs were expressed at moderate levels, including lamin A (Lmna), lamin B1 (Lmnb1), lamin B2 (Lmnb2), and nestin (Nes), with Lmnb1, Lmnb2, and Nes expressed more in epithelial cells. There was low expression of several keratin genes in the lens, including keratin 10 (Krt10), keratin 18 (Krt18), keratin 19, (Krt19), and keratin 40 (Krt40).
LincRNAs are a group of newly identified non-coding RNAs that play an important role in gene transcription and protein translation [11]. The expression pattern and functional role of lincRNAs in the lens are largely unknown. Like mRNAs, lincRNA transcripts undergo polyadenylation. Thus, lincRNAs are subjected to RNA-Seq via polyA selection [11,108,109]. Among 1,746 annotated lincRNA genes, we detected the expression of 254 lincRNAs in the lens, of which 86 lincRNA genes were differentially expressed (32 genes upregulated in epithelial cells and 54 genes upregulated in fiber cells). In the top 30 most differentially expressed lincRNAs (ranked by p values, Figure 6), most lincRNA genes were upregulated in fiber cells, including RP23–237H8.2, AC135859.1, AL663030.1, AC128663.1, and AC100730.1. Little information exists concerning the specific functions of these lincRNAs. LincRNAs upregulated in epithelial cells include AC154449.1, RP23–159J2.2, 2810433D01Rik, Gm13889, Mir17hg, AL732311.1, and AC020971.1.
Maternally expressed gene 3 (Meg3) and maternally expressed gene 8 (Meg8 or Rian) are overlapping but non-identical transcripts found in the Dlk1-Dio3 imprinted region. Meg3 and Rian were upregulated in fiber cells. Meg3 is expressed in many normal tissues, including the eyes [110]. Previous studies suggested that Meg3 acts as a tumor suppressor and regulates vascularization as well as pattern specification and cell differentiation during ear development [111,112]. Mice carrying a maternal Meg3 deletion die before birth, but no eye phenotype was described [113]. Rian is highly expressed in various tissues in mouse embryos, including the brain, tongue, and liver [114], but the function of Rian is unknown, and, until now, the expression of Rian in the eye had not been reported.
Malat1 is one of the most abundant and conserved long non-coding RNAs. In situ hybridization showed that Malat1 is expressed in many tissues including the lens and in cancers [115,116]. RNA-Seq detected high expression of Malat1 in epithelial cells and fiber cells. A recent study demonstrated that microRNA-9 targets Malat1 transcripts for degradation in the nucleus [117]. Surprisingly, three independent research groups that created targeted mutations in Malat1failed to detect any obvious phenotypes [117-120].
RNA-Seq detected Mirh17hg preferentially in the lens epithelium. Previous studies showed that Mir17hg is the host gene for a polycistronic transcript containing the MIR17–92 cluster, a group of at least six miRNAs that may be involved in cell proliferation, differentiation, and cell survival [121-123]. However, the functional role of Mir17hg in the lens is unknown.
This first comprehensive transcriptome analysis with RNA-Seq in the ocular lens provides a valuable resource for studying lens development, fiber differentiation, and lens pathogenesis. In addition to providing a platform for comparing lens epithelial cell and fiber cell protein coding gene expression, for the first time, the expression patterns of lincRNAs in the lens were characterized. However, the functional significance of these lincRNAs in lens development or physiology remains unknown. Further studies are needed to elucidate their functional significances in lens development and fiber cell differentiation.
Appendix 1: List of specific primers for genes used in RT-qPCR.
Appendix 2. Differential expression of all protein-coding genes.
Appendix 3. Differential expression of lincRNAs.
Appendix 4. Top 50 highest expressed genes in epithelial cells based on RPKM values.
Appendix 5. Top 50 highest expressed genes in fiber cells based on the fiber RPKM values
Appendix 6. Expression of genes involved in TGFβ signaling in lens epithelial cells and fiber cells
Appendix 7. Expression of genes involved in apoptosis and protein degradation pathways
Appendix 8. Expression of genes encoding for aquaporins, connexins and intermediate filaments
We would like to thank the Genomics, Epigenomics and Sequencing Core at The University of Cincinnati, directed by Dr. Shuk-Mei Ho, for next generation RNA sequencing. Dr. Shigekazu Nagata in Kyoto University kindly provided us the microarray data from his study. We also thank Adam S. LeFever for editorial assistance. This work was supported by NIH grants, R01EY010540 and R01EY16707 awarded to Dr. Panagiotis Tsonis. Dr. Michael Robinson (Robinsm5@miamioh.edu) and Dr. Chun Liang (Liangc@miamioh.edu) are co-corresponding authors for this paper.