Molecular Vision 2002;
Received 31 August 2001 | Accepted 14 December 2001 | Published 15 June 2002
Expressed sequence tag analysis of adult human lens for the NEIBank Project: Over 2000 non-redundant transcripts, novel genes and splice variants
Steven L. Bernstein,2
M. Keith Wyatt,1
Jeffrey W. Touchman,3
1Section on Molecular Structure and Function, National Eye Institute, National Institutes of Health, Bethesda, MD; 2Departments of Ophthalmology and Neurobiology & Genetics, University of Maryland School of Medicine, Baltimore, MD; 3NIH Intramural Sequencing Center, Gaithersburg, MD
Correspondence to: Graeme Wistow, Ph.D., Chief, Section on Molecular Structure and Function, National Eye Institute, Building 6, Room 331, National Institutes of Health, Bethesda, MD, 20892-2740; Phone: (301) 402-3452; FAX: (301) 496-0078; email: email@example.com
Purpose: To explore the expression profile of the human lens and to provide a resource for microarray studies, expressed sequence tag (EST) analysis has been performed on cDNA libraries from adult lenses.
Methods: A cDNA library was constructed from two adult (40 year old) human lenses. Over two thousand clones were sequenced from the unamplified, un-normalized library. The library was then normalized and a further 2200 sequences were obtained. All the data were analyzed using GRIST (GRouping and Identification of Sequence Tags), a procedure for gene identification and clustering.
Results: The lens library (by) contains a low percentage of non-mRNA contaminants and a high fraction (over 75%) of apparently full length cDNA clones. Approximately 2000 reads from the unamplified library yield 810 clusters, potentially representing individual genes expressed in the lens. After normalization, the content of crystallins and other abundant cDNAs is markedly reduced and a similar number of reads from this library (fs) yields 1455 unique groups, of which only two thirds correspond to named genes in GenBank. Among the most abundant cDNAs is one for a novel gene related to glutamine synthetase, which was designated "lengsin" (LGS). Analyses of ESTs also reveal examples of alternative transcripts, including a major alternative splice form for the lens specific membrane protein MP19. Variant forms for other transcripts, including those encoding the apoptosis inhibitor Livin and the armadillo repeat protein ARVCF, are also described.
Conclusions: The lens cDNA libraries are a resource for gene discovery, full length cDNAs for functional studies and microarrays. The discovery of an abundant, novel transcript, lengsin, and a major novel splice form of MP19 reflects the utility of unamplified libraries constructed from dissected tissue. Many novel transcripts and splice forms are represented, some of which may be candidates for genetic diseases.
The lens is a highly specialized tissue. Through its transparency and its sophisticated optical properties, which are derived largely from subtle gradients in refractive index, it plays an important role in visual acuity. The lens is also essential for the normal development of other eye tissues, particularly the rest of the anterior segment [2,3]. The mechanisms by which the lens influences and is in turn influenced by other eye tissues are incompletely understood. The lens derives from epithelial ectoderm, but its further development is in the context of the neurally derived optic cup and migrating tissues of neural crest origin [4,5]. Perhaps as a consequence of this heritage, the tissue expresses genes usually associated with both epithelial and neural cells, such as epidermal fatty acid binding protein (eFABP) and Pax6. Furthermore, other ocular cell types of neural origin, such as retinal and iris pigment epithelia, can transdifferentiate into lens-like cells under certain conditions.
The lens grows throughout life as cells in an anterior epithelial layer migrate to the lens equator and undergo terminal differentiation into extremely elongated fiber cells [9,10]. These highly specialized cells express large amounts of a few classes of protein, principally crystallins and some characteristic membrane and cytoskeleton proteins. Eventually they lose their cellular organelles, including nuclei, in a maturation process that has some interesting parallels with programmed cell death or apoptosis [11-14]. Fiber cells do not turn over and must therefore survive for decades while exposed to light and the normal insults of aging. Although the embryonic lens is surrounded by a network of blood vessels [10,15], through most of its long life the tissue is avascular, making cell-cell communication essential for transport of nutrients and waste products. Considering the tremendous demands on the tissue, it is perhaps not surprising that for humans, cataract (lens opacity) is an almost inevitable consequence of survival to old age. Worldwide, this is a leading cause of visual impairment and blindness.
Human lens is one of the ocular tissues that have been relatively neglected in previous expressed sequence tag (EST) studies. As part of NEIBank, a project involving genomics and bioinformatics studies of the eye , a cDNA library has been constructed from adult human lens. The un-normalized library has been used for EST analysis to collect representatives of highly expressed lens transcripts. A normalized version has also been created to explore rarer transcripts. For known genes, full length clones and novel splice forms have been identified. Many apparently novel genes have also been identified. This paper will summarize some of the basic findings from the continuing analyses of these libraries, focusing on abundant transcripts and novel splice forms among several important categories, including lens membrane and cytoskeletal components, growth factors, transcription factors, and the machinery of apoptosis. Additional analysis, including further sequencing, will continue as the project progresses. In particular, annotations and enhancements will be added to a full display of the data at NEIBank. Clones from all the NEIBank libraries are being compiled in a large nonredundant set for construction of human eye cDNA microarrays.
cDNA Library Construction
Post-mortem eye tissues were obtained under University of Maryland School of Medicine IRB exemptions SB-019701 and SB-129901. Total RNA was extracted using RNAzol or TRIzol (Tel-Test Inc., Friendswood, TX). PolyA+ RNA was prepared using an oligo(dT) cellulose affinity column and used for cDNA synthesis in the SuperScript Plasmid System (Life Technologies, Rockville, MD; now part of Invitrogen Corp., Carlsbad, CA). First strand synthesis was carried out using a Not I primer-adapter [GACTAGTTCTAGATCGCGAGCGGCCGCCC(T)15] and SuperScript II reverse transcriptase. After second strand synthesis, using E. coli DNA polymerase, Not I/blunt end inserts were directionally cloned into the Not I/EcoR V sites in the pCMVSPORT6 vector. Plasmids were electroporated into DH10B cells. A sample of clones was picked for quality control analysis.
An additional procedure was used to remove excess empty vector clones from the libraries. Plasmid DNA was harvested from 10^6 primary clones grown in liquid culture at 30 °C under conditions designed to minimize the effects of differential growth rates. Gel purification was used to isolate 2 µg of plasmid DNA larger than the size of empty vector. This was reintroduced to host cells as before and the libraries were re-analyzed.
For normalization, the library was first amplified using a semi-solid procedure to minimize representational bias. This was performed at 30 °C to improve clone stability. Primary cDNA transformants (4-6 × 10^5) were added to bottles containing semi-solid agarose at 37 °C. Bottles were cooled on ice for 1 h then incubated at 30 °C for 40-45 h without disturbance. Contents were centrifuged at room temperature and cells were resuspended in 2X LB glycerol. Aliquots were analyzed for colony estimate and insert size.
The amplified library was then normalized by self-subtraction. One portion of double stranded plasmid DNA representing the library was linearized by NotI. This NotI digested library was used as a template for biotinylated RNA synthesis using SP6 RNA polymerase. Another portion of the double stranded plasmid library was converted to single-stranded circles in vitro using Gene II and Exonuclease III (Life Technologies). Single-stranded DNA (1 µg) was hybridized (C0t 500) with 41 µg of Bio-RNA and vector blocking oligonucleotides. The hybridized Bio-RNA/ss-circles were removed by streptavidin:phenol extraction. Processing of the normalized library was then completed by standard methods [20,21]. Content of elongation factor 1α, a highly abundant component of most cDNA libraries, was monitored by hybridization to judge the effect of the normalization process. Normalization was carried out at Life Technologies.
High throughput sequencing was performed at the NIH Intramural Sequencing Center (NISC). For EST sequencing, individual clones were inoculated in 1.2 ml of Terrific Broth (Quality Biological Inc., Gaithersburg, MD) containing 100 µg/ml ampicillin (in 2 ml wells of 96 well plates) and incubated with agitation at 37 °C for 20-24 h. Plasmid DNA was prepared by alkaline lysis and DNA was suspended in 50 µl of TE (10 mM Tris-HCl, pH 7.5; 0.1 mM EDTA). Fluorescent DNA sequencing reactions were performed using M13 forward (GTTTTCCCAGTCACGAC) or reverse (CAGGAAACAGCTATGACC) primers and the BigDye terminator sequencing kit (PE Applied Biosystems, Foster City, CA). The products were analyzed on ABI 377 and 3700 automated fluorescent sequencers (PE/Applied Biosystems). Full insert cDNA sequencing was performed on selected clones by a primer-walking sequencing strategy until the sequence of both strands of the cDNA was determined. The sequence was edited and assembled using the program Sequencher (Gene Codes, Ann Arbor, MI).
ABI sequence data were analyzed using PHRED to identify and trim quality reads. Vector, E. coli genome, and human mitochondrial sequences were trimmed or eliminated using the programs RepeatMasker (by Arian Smit and Phil Green) and CrossMatch (by Phil Green) as described previously. High quality cDNA reads were analyzed using BLAST (National Center for Biotechnology Information (NCBI), National Library of Medicine, Bethesda, MD) to compare with GenBank nucleotide sequences (NT), protein sequences (NR) and dbEST. Sequences were also BLASTed against the other clones in the library dataset to identify overlapping clones.
Linker sequences and cloning artifacts that survive prior processing were removed by a custom program, rmlinker, which uses an updateable list of observed artifacts to trim iteratively from the 5' end of each sequence, to remove sequences that have less than 50 bp of unmasked sequence and also, after initial BLAST runs, to remove any clones identified as non-mRNA. Custom software, GRIST, was used to group and identify the ESTs based on BLAST results. For validation and examination of individual clusters, sequences were also aligned using SeqMan II (DNASTAR, Madison, WI).
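The trimming and length filter described for rmlinker can be illustrated with a short sketch. This is not the rmlinker source, which is not shown here; the function names, the example artifact strings, and the use of N/X as masking characters are all assumptions made for illustration.

```python
def trim_artifacts(seq, artifacts):
    """Strip listed artifacts from the 5' end, repeating until none matches."""
    seq = seq.upper()
    artifacts = [a.upper() for a in artifacts if a]
    changed = True
    while changed:
        changed = False
        for art in artifacts:
            if seq.startswith(art):
                seq = seq[len(art):]
                changed = True
    return seq


def unmasked_length(seq):
    """Count bases that are not masked (masking with N or X is assumed)."""
    return sum(1 for base in seq.upper() if base not in "NX")


def filter_reads(reads, artifacts, min_unmasked=50):
    """Trim every read, then keep only reads with enough unmasked sequence."""
    kept = {}
    for name, seq in reads.items():
        trimmed = trim_artifacts(seq, artifacts)
        if unmasked_length(trimmed) >= min_unmasked:
            kept[name] = trimmed
    return kept


if __name__ == "__main__":
    hypothetical_artifacts = ["GGCACGAG"]  # illustrative only, not a real list
    reads = {
        "clone_a": "GGCACGAGGGCACGAG" + "ACGT" * 15,
        "clone_b": "GGCACGAG" + "N" * 40 + "ACGT" * 5,
    }
    print(sorted(filter_reads(reads, hypothetical_artifacts)))
```

Keeping the artifact list as plain data mirrors the "updateable list" idea in the text: newly observed artifacts can be added without changing the trimming logic.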
Polymerase chain reaction (PCR) methods
PCR was used to validate alternative splice forms, to obtain probes for hybridization, and to complete sequences. For template, a sample of the complete cDNA library representing at least one million primary clones was amplified and plasmid was isolated using reagents from QIAGEN (Valencia, CA). Fragments were amplified using either Taq (Roche, Indianapolis, IN) or Elongase (Life Technologies) polymerase systems and following manufacturer's protocols.
Radiation hybrid mapping was performed at Research Genetics (Huntsville, AL), using the Stanford G3 panel. PCR was used to amplify unique marker sequences from a total of 83 clones and two controls. PCR primers were designed from the deduced 3' UTR of the target sequence, a region unlikely to be interrupted by introns in the genome. For lengsin, the primer pair GCCACCCAAATTGGATATT and ACAACAAAGTTGTGGTGAAGC was used to produce a specific 107 bp band in human genomic DNA. An email server operated by the Stanford Human Genome Center was used to link the marker to more than 15000 framework markers.
Results & Discussion
Un-normalized library (by):
Lenses from two adults were used to construct a cDNA library. The number of primary recombinants was estimated as 2.1 × 10^8, with an average insert size of 1 kbp. Although the quality of the cDNA inserts, as judged by initial quality control (QC) sequencing, was very good, the relatively small amount of mRNA used resulted in an unacceptable fraction (almost 50%) of empty vector clones after cDNA cloning. To improve sequencing efficiency, a special procedure was used to remove vector sized clones through electrophoresis and gel purification. An aliquot of 10^6 primary recombinants was grown at low temperature in agarose to minimize effects of differential growth rate. Plasmid was harvested and gel purification was used to remove DNA of the size of empty vector. After modification, empty vector clones accounted for only 14% of the total quality reads. Analysis suggested that the relative abundance of cDNAs in the modified library had not been seriously affected. This was based on comparisons between quality control plates for both the original and the modified library and on the overall content of crystallins and other lens specific (or preferred) transcripts of varying sizes.
Approximately 2100 clones from this library, designated by, have been analyzed. Clones are named using these key letters followed by their grid position in 96 well plate format. All have been sequenced from the 5' end and some from both directions. The average high quality read length is 470 bp. Approximately 2% of the sequenced clones are derived from mitochondrial genome and 3% from rRNA. As expected, the library contains a high fraction of crystallin clones, accounting for over 18% of the total. This abundance of fairly short crystallin transcripts contributes to the average insert size of 1 kbp, which is lower than in some other libraries. However it may also contribute to the high proportion of apparently full length clones in this library. As judged by the presence of the start site codon in clones of known identity, (essentially the same criteria as used in "full length" cDNA projects ) 76% of cDNA inserts in the by library contain the complete coding sequence. At this time, approximately 10% of the clones have no identifiable match with any known sequences in GenBank or in Unigene. This number is subject to change as the data are regularly reanalyzed and as more human genes are identified through the genome projects. After informatics analysis using GRIST , the sequences from the by library form 810 clusters (containing one or more cDNA clones), each of which potentially represents a unique gene. This is a relatively low number that again reflects the abundance of crystallins, along with other transcripts such as ribosomal proteins, elongation factor 1α, and clusterin/DNAJ that are typically present at high levels in most cDNA libraries. To obtain a broader representation of lens transcripts, the lens library was subjected to normalization. However, even in by, the majority of the GRIST clusters are singletons, that is, genes represented by a single cDNA clone.
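The notions of clusters and singletons used here can be made concrete with a generic single-linkage grouping of reads from pairwise similarity hits. This is only a sketch: the internal algorithm of GRIST is not described in this section, and the read identifiers and hit pairs below are hypothetical.

```python
def cluster_reads(read_ids, hits):
    """Group reads so that any two reads linked by a chain of hits share a cluster."""
    parent = {r: r for r in read_ids}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for r in read_ids:
        clusters.setdefault(find(r), []).append(r)
    return list(clusters.values())


def singleton_fraction(clusters):
    """Fraction of clusters that contain exactly one read."""
    return sum(1 for c in clusters if len(c) == 1) / len(clusters)


if __name__ == "__main__":
    reads = ["r1", "r2", "r3", "r4", "r5"]  # hypothetical read names
    hits = [("r1", "r2"), ("r2", "r3")]     # hypothetical overlapping pairs
    groups = cluster_reads(reads, hits)
    print(len(groups), singleton_fraction(groups))
```

In a real pipeline the hit pairs would come from filtered BLAST results, and the thresholds used for that filtering would strongly affect how many clusters are reported.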
Normalized lens library (fs)
Normalization preferentially subtracts abundant transcripts from a library, removing information about true transcript abundance but giving access to genes expressed at lower levels. In the normalized version of the lens library, designated fs, the average insert size of the library has increased to 1.4 kbp, reflecting the reduction in abundance of crystallins and other short transcripts. One measure of the success of normalization is comparison of the level of cDNA for elongation factor 1α (EF1α) as estimated by dot blot hybridization of an aliquot of DNA from the amplified and unamplified libraries (not shown). The abundance of EF1α is approximately 60 fold lower in fs than in by. The effects of normalization are also obvious from EST analysis. So far, 2200 clones from fs have been sequenced.
The most abundant clones from by, such as crystallins, are dramatically reduced in their occurrence in fs. For example, for similar numbers of sequences, the content of βB2-crystallin declined from 84 clones to 2, and αB declined from 81 to zero. A few other transcripts are moderately abundant in fs. These include ferritin, which is abundant in both by and fs with 15 and 13 clones, respectively. Since normalization involves hybridization between single stranded nucleic acids, secondary structure may affect its efficiency. Interestingly, ferritin mRNA is known to contain hairpin structures that are involved in post-transcriptional regulation. Conceivably, such hairpins may form during normalization, perhaps inhibiting hybridization and thereby preventing the removal of such transcripts.
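The clone counts quoted above can be put on a common scale by converting them to frequencies per thousand reads. The sketch below assumes library totals of roughly 2100 (by) and 2200 (fs) reads, as stated in the text; these totals are approximate, so the resulting fold changes are indicative only.

```python
BY_READS = 2100  # approximate number of by clones analyzed (from the text)
FS_READS = 2200  # approximate number of fs clones sequenced (from the text)

# Clone counts (by, fs) as quoted in the text.
COUNTS = {
    "betaB2-crystallin": (84, 2),
    "alphaB-crystallin": (81, 0),
    "ferritin": (15, 13),
}


def per_thousand(clones, total_reads):
    """Clone frequency expressed per 1000 reads."""
    return 1000.0 * clones / total_reads


def depletion_table(counts, by_total, fs_total):
    """Return {gene: (by_freq, fs_freq, fold_reduction)}; fold is inf if fs count is 0."""
    rows = {}
    for gene, (by_n, fs_n) in counts.items():
        by_freq = per_thousand(by_n, by_total)
        fs_freq = per_thousand(fs_n, fs_total)
        fold = by_freq / fs_freq if fs_n else float("inf")
        rows[gene] = (by_freq, fs_freq, fold)
    return rows


if __name__ == "__main__":
    for gene, (by_f, fs_f, fold) in depletion_table(COUNTS, BY_READS, FS_READS).items():
        print(f"{gene}: by={by_f:.1f}/1000, fs={fs_f:.1f}/1000, fold reduction={fold:.1f}")
```

On these assumed totals, βB2-crystallin falls by roughly 44 fold while ferritin changes by only about 1.2 fold, which is the contrast the text draws between efficiently and poorly subtracted transcripts.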
However, overall the fs library is clearly much flatter in its distribution than by. The 2200 sequence reads produced 1455 unique clusters after GRIST analysis, with over 83% of the clusters singletons. Thus the normalized library contained a much wider variety of cDNAs, including a number of potentially novel clones. Indeed, certain genes known to be expressed in lens but absent from the initial sampling of by clones, such as the transcription factors Pax6 and Sox2, were found in the fs set. Full details of the clones from by and fs are available through NEIBank.
The 25 most abundant transcripts in the un-normalized (by) adult human lens library are shown in Table 1. As expected, the list is dominated by crystallins, with ten of the twelve human crystallins represented in the library ranked among the top 25. The two α-crystallins, αA and αB, are each represented at a level of about 5% of the cDNAs in the library. All six β-crystallins, βA3/A1, βA2, βA4, βB1, βB2, and βB3, are present and βB2 is the single most abundant cDNA in the library. The high representation of βA2 is unexpected and is under further investigation. For the γ-crystallins, by far the most highly represented cDNA is that for γS, the member of the family whose expression in lens increases after birth [29,30]. With 41 clones, γS represents over 2% of the total number of cDNA clones sequenced. In addition to γS, mammalian genomes typically contain six γ-crystallin genes, γA-F, which are expressed primarily in the embryonic lens [31-33]. In humans, γD seems to be expressed at the highest levels while γE and γF are inactive pseudogenes . The by collection contains six copies for γD and one copy each for γB and γC. No ESTs for γA are found in either by or fs collections. Although these results are clearly consistent with previous analyses [31-33], they do not tell us whether the γ-crystallins are still being expressed in the 40 year old lens since the ESTs may represent survival of fiber cell transcripts from a much earlier developmental stage.
The single most striking feature of the vertebrate lens is the enormous elongation of its fiber cells [4,5]. The peculiar demands of lens cell structure, maturation and movement are reflected in specialized cytoskeleton proteins. The lens has two highly tissue specific cytoskeleton proteins, filensin and phakinin (BFSP1 and BFSP2), that are components of the characteristic beaded filaments of the fiber cells. Both of these transcripts are among the top 25 most abundant cDNAs in the by collection (Table 1). Two other cytoskeletal proteins, vimentin (with 7 ESTs) and α-tubulin (with 4 ESTs), are also highly represented, along with cDNA for several proteins involved in dismantling cytoskeleton, such as the actin depolymerizing proteins destrin and cofilin. A list of cytoskeleton related transcripts identified so far in the combined by and fs datasets is shown in Table 2. Interestingly, these include a number of components of actin related transport systems including dynactins and dyneins, proteins that are more usually associated with axons, while contractile proteins such as myosins and caldesmon are also represented. These probably form part of the machinery by which maturing lens cells migrate along the lens capsule basement membrane. Members of the Rho family of small GTPases, proteins that are involved in cytoskeletal rearrangements, are also present and are listed in Table 2.
Armadillo repeat proteins
The cytoskeleton forms active connections with many other signaling and structural mechanisms in the cell. In particular, there are important links between the cytoskeleton and cell adhesion complexes. One gene whose transcripts are actually increased in abundance in the fs library is Armadillo Repeat Gene Deleted in Velocardiofacial syndrome (ARVCF). ARVCF encodes a protein related to β-catenins, proteins that play an important role in the formation of adherens junction complexes and in transcriptional activation, mediating Wnt signal transduction [39,40]. The ARVCF protein contains two major regions, a relatively short N-terminal coiled-coil domain and a long C-terminal region composed of ten armadillo repeats that are characteristic of the family. Both these structural motifs are thought to be involved in protein-protein interactions and armadillo repeat proteins are known to mediate interactions between the cell adhesion complexes formed by cadherins and the actin cytoskeleton. As its name implies, ARVCF is deleted in velocardiofacial syndrome (OMIM: 192430), a condition with multiple abnormalities, which, interestingly, may include bilateral cataract.
The ARVCF gene is found on chromosome 22, divided into 19 exons (NT_011519.9). Five of the six ESTs for ARVCF from lens are almost full length. One of these matches the sequence of the published cDNA, but surprisingly the other four are all variant transcripts that arise from alternative exons in the third intron of the ARVCF gene (Figure 1). Three of the ESTs use an alternative exon (A), which splices to exon 4, while the fourth EST is produced by splicing from exon A to another alternative exon (B) and then on to exon 4. Exon B is 33 bp and can therefore maintain the ARVCF ORF. The available sequence for exon A also contains contiguous ORF (Figure 1), but since the authentic 5' end of the variant transcripts is not yet apparent, it is not clear what the complete sequence of the variant protein would be and whether exons 1-3 of ARVCF are included. The novel sequence from exons A and B is highly charged, with twelve strongly basic residues and seven acidic residues, but has no strong similarities to known proteins. The variant protein may have replaced or modified the coiled-coil domain, thereby altering its interactions with other proteins. Since the variant forms of the ARVCF transcript seem to be in the majority in lens, albeit in the normalized library, it will be interesting to see if variant ARVCF proteins have a particular role in lens adherens junctions and their interactions with cytoskeleton. The discovery of alternative exons for this gene also provides new sequences to search for mutations underlying genetic defects that map to this locus, particularly those that involve cataract.
Another armadillo repeat protein, plakophilin 4/p120(ctn) (PKP4), which is also a component of adherens junctions, is moderately abundant in the un-normalized by collection of cDNAs where it is represented by two ESTs.
One of the most characteristic and tissue specific proteins of the lens is MIP (major intrinsic protein), the founder member of the MIP/aquaporin family of channel proteins [9,41]. A first view of the by collection found only a single EST for MIP. However, further analysis of a group of three, apparently novel, lens specific clones clustered by GRIST (by01f11, by16b10, by18b10) revealed that they derive from previously unidentified 3' UTR regions of MIP (unpublished data), giving a total of four clones for MIP in the set sequenced. In addition, and somewhat surprisingly, two clones were found in the by collection and two more were found in the normalized fs collection, for aquaporin 5, another member of the same gene family. AQP5 has previously been detected only at low levels in rat lens; it remains to be seen if it makes an important contribution to channels in the adult human lens. As for the classical gap junction proteins, cDNAs for connexins 37 and 50 are represented among the clones sequenced from the normalized lens fs library. Previously, connexin 50 has been identified as the locus of zonular pulverulent cataract in humans.
Alternative splicing of MP19
Although MIP is usually regarded as the most abundant lens membrane protein, by far the most abundant cDNAs for a lens membrane protein in the un-normalized by collection are for MP19. MP19 (or Lim2) is a protein of size variously described as 17, 19, or 20 kDa in different species [44-48], that belongs to a large superfamily of integral membrane proteins with four transmembrane helices, sometimes called tetraspanins. MP19 appears to be highly lens specific, and in mouse, its gene, lim2, is the locus of the To3 genetic cataract [50,51].
The by EST collection contains 14 clones for MP19, 0.8% of the total number of sequences. When these are aligned, 4 of the 12 cDNAs that are full length or nearly full length can be seen to contain an insertion of 126 bp in the protein coding sequence. Complete cDNA sequences for both canonical human MP19 and the novel insert form, MP19ins, were obtained from these clones and have been submitted to GenBank (Accessions: AF340019 and AF340020). The MP19ins insertion is in frame and increases the size of the ORF by 42 amino acid residues (Figure 2A), so that while the full size of human MP19 is 172 residues, corresponding to a size of 19.5 kDa, the insert form, MP19ins, is 214 residues (24.1 kDa). Interestingly, early work on MP19 suggested the existence of larger variant forms in lens, including a ~25 kDa species that cross reacts with MP19 antisera [46-48], although the identity of such variants was not established. The existence of MP19ins probably accounts for these observations.
The inserted sequence lengthens the predicted extracellular loop between the first two transmembrane helices of MP19 (Figure 2A) with a hydrophilic polypeptide that contains eight strongly basic (arg/lys) residues and 11 other polar residues. This charge distribution is reflected in the predicted pI for the two forms of the protein, which are 9.6 for MP19 but 10.1 for MP19ins. Although tetraspanins are widespread, their functions are not well understood . This means that the consequences of the insertion in MP19 are not easy to predict. Lens transparency requires close packing of fiber cells in orderly arrays and efficient communication between the cells [9,52]. It is reasonable to expect that MP19, as a tissue-specific, abundant, integral membrane protein of the lens, has some involvement in these important processes. If this is so, the insert sequence of MP19ins probably modifies cell contacts required for organization or communication between lens cells. It will be interesting to see if future studies reveal any specific distribution of MP19ins in the lens.
Examination of the MP19 gene sequence (GenBank Accession: L04193) shows that the insertion results from alternative splicing of the gene transcript. As shown in Figure 2B, this occurs when the previously identified 3' splice junction of exon 2 is skipped so that exon 2 extends through an additional region, designated exon 2ins, to an alternative, canonical splice junction that splices to exon 3. The insert sequence is an in frame, unbroken addition to the ORF of the gene.
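The reading-frame arithmetic behind the sizes quoted for MP19 and MP19ins can be checked in a few lines. The function below is a hypothetical helper written for illustration; it only tests divisibility by three and assumes the inserted sequence contains no stop codon, which the text states is the case for exon 2ins.

```python
def insertion_effect(insert_bp, base_residues):
    """Return (added_residues, new_length) for an in-frame insert, or None on frameshift.

    Assumes the inserted sequence itself contains no in-frame stop codon.
    """
    if insert_bp % 3 != 0:
        return None  # insert length is not a multiple of 3: reading frame is disrupted
    added = insert_bp // 3
    return added, base_residues + added


if __name__ == "__main__":
    # MP19: 172 residues; exon 2ins adds 126 bp (values from the text).
    print(insertion_effect(126, 172))
```

The same check applies to the 33 bp alternative exon B of ARVCF described earlier, whose length is also a multiple of three and so is compatible with maintaining the ORF.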
Enzymes, the workhorses of the cell, are often overlooked in analyses of tissue expression patterns in favor of other groups, such as transcription factors and receptors. However they are clearly of great importance and, for the lens, they have special significance in some key areas. In many vertebrate species, some major crystallins have turned out to be enzymes that are overexpressed in a tissue specific manner to acquire new, structural roles [54-57]. Enzymes have also been implicated in two important processes related to cataract, namely protection against oxidation [9,58,59] and proteolysis of lens proteins.
Indeed, several enzymes are among the most abundant transcripts in the by library dataset (Table 1). Two of these, α-enolase and glyceraldehyde 3-phosphate dehydrogenase (GAPDH), have separately been recruited as taxon specific crystallins in different vertebrate species. Their abundance in the human lens, where they are not thought to have a particular structural role, is consistent with the idea that novel crystallins have arisen during evolution through gene recruitment of proteins that are already expressed at moderate levels in the lens and had existing, useful roles in the tissue [56,57].
Useful roles are easy to envisage for three other enzymes, glutathione synthetase, carbonyl reductase 1 (CBR1) and superoxide dismutase 1 (SOD1), that are highly abundant in the by collection. The levels of cDNA for these enzymes probably reflect the critical role of anti-oxidation mechanisms (see below) in a tissue that is peculiarly vulnerable to oxidative damage during exposure to light over a long period of time. Another relatively abundant enzyme in the lens is prostaglandin D synthetase. Interestingly, this enzyme is also abundantly represented among cDNAs from human iris. This may reflect a common role for both these anterior segment tissues in synthesis of prostaglandin D, a factor which is implicated in control of intraocular pressure [62,63].
Among the most abundant classes of clones in the by collection is a novel gene transcript. In initial analysis, three overlapping clones were found. One of these was completely sequenced and allowed the detection of three additional, shorter clones in the collection. The complete cDNA sequence derived from these clones is shown in Figure 3. Later comparisons with genome sequence identified a seventh clone in by that may derive from a long 3' UTR of this transcript, so that cDNAs from this novel gene represent over 0.3% of the total from this library. In dbEST, the only other representations of this gene (Unigene cluster Hs.149585) presently come from whole embryo and from the Cancer Genome Anatomy Project tumor libraries, which are not good indicators of tissue specificity. Overall, the expression patterns at the level of ESTs are consistent with the possibility that the new gene may be lens preferred in normal tissue.
The full length sequence of the novel transcript contains a long ORF of 509 codons (Figure 3). While the first 80 residues have no significant match with other sequences in the databases, the remainder of the sequence has 25-29% identity with various members of the glutamine synthetase (GS) superfamily. Interrogation of protein motif databases reveals three regions that match profiles for this superfamily (Figure 3). From these relationships the name lengsin, for lens glutamine-synthetase-like (LGS), was chosen.
The GS superfamily is represented widely in both prokaryotes and eukaryotes. Of three identified classes, only class II has so far been observed in eukaryotes, while class I (GS-I) has so far been restricted to prokaryotes [64,65]. Interestingly, LGS most closely resembles GS-I sequences. The closest match from BLASTX search is 29% identity, over a range of more than 400 amino acid residues, with a GS-I member, a putative amino group transfer enzyme of Pseudomonas putida (Figure 4). Slightly lower similarities are seen with many other GS-I sequences. This raises the possibility that LGS may be the first example of non-class II GS member in vertebrates. Indeed, using the sensitive Gribskov Z score analysis for detecting conserved motifs , a weak but significant match with the characteristic adenylation site sequence of GS-I is detected in the LGS sequence (Figure 3).
In spite of its general similarities to bacterial proteins, LGS is indeed a human gene. RH mapping localizes the gene to the centromeric region of human chromosome 6 (6q11.2), with a LOD score of 16.45 for marker SHGC-13396. This localization is confirmed by sequence data from the human genome project. Interestingly, Leber's Congenital Amaurosis 5 (LCA5), a genetic disorder which has multiple effects, including cataract (OMIM:604537), maps to the same region. The major form of LGS transcript is divided into 4 exons, although there is also preliminary evidence of alternative splicing. Further analysis of this novel gene will be reported in detail elsewhere.
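For readers less familiar with mapping statistics, a LOD score is conventionally the base-10 logarithm of a likelihood ratio (linkage to the marker versus no linkage). The short Python sketch below simply converts between the two scales under that standard definition; the helper names are illustrative and are not part of any mapping software.

```python
import math


def lod_to_odds(lod: float) -> float:
    """Convert a LOD score to the corresponding likelihood ratio."""
    return 10.0 ** lod


def odds_to_lod(odds: float) -> float:
    """Convert a likelihood ratio back to a LOD score."""
    if odds <= 0:
        raise ValueError("odds must be positive")
    return math.log10(odds)


# The LOD score reported for marker SHGC-13396.
odds = lod_to_odds(16.45)
print(f"likelihood ratio of about {odds:.2e} to 1")
```

On this scale, the reported score corresponds to a likelihood ratio on the order of 10^16 in favor of linkage to the marker.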
Oxidation related transcripts
As described above, some of the abundant transcripts of the lens are enzymes with roles in protection against oxidative damage. For obvious reasons, the lens and its proteins experience tremendous exposure to light, including UV radiation, and this has long been regarded as a potential source of oxidative insult that may contribute to cataract [9,58,59]. Table 3 lists oxidation related transcripts (mainly enzymes) that are identified so far in the combined by and fs sequence collections. The glutathione system is perhaps the major defense against oxidation in the lens [9,68]. Table 3 lists several enzymes of glutathione metabolism, including glutathione synthetase, four different glutathione peroxidases (GPx1, 3 and 4, and non-selenium glutathione peroxidase/anti-oxidant protein 2), and two glutathione S-transferases of different classes (GST-M5 and pi). Table 3 also lists several enzymes related to another antioxidant, thioredoxin, that functions in coordination with glutathione. So far, no transcripts for thioredoxin itself have been identified, but the closely related "thioredoxin-like protein" (TXNL/TRP32) is present, along with a thioredoxin reductase, a thioredoxin peroxidase, and two peroxiredoxins. The lens sequence collection also contains a cDNA for epoxide hydrolase 2, a cytoplasmic enzyme involved in protection against the toxic products of fatty acids.
A redox/clock connection?
While the redox environment of the lens is clearly important in terms of protection of lens components from oxidative damage, a surprising new cellular mechanism connecting oxidation state to circadian rhythms and gene expression has recently emerged. The transcription factors Clock and NPAS2 and the flavin containing cryptochromes participate in an oscillating system of feedback loops that control the diurnal cycle in mammalian cells [74,75]. Both NPAS2 (clone fs08909) and cryptochrome 1 (clone by07b01) are represented in the lens sequence collection, suggesting that the lens does have the necessary mechanisms for daily cycling. Surprisingly, it now turns out that the NPAS/cryptochrome system is closely controlled by the ratio of oxidized to reduced NAD(P) cofactors in the cell. NPAS in turn controls expression of metabolic enzymes, such as lactate dehydrogenase, that control levels of NAD(P)H. It makes sense that the lens, as a tissue exposed to light, could benefit from an internal clock, perhaps to control the activity of defensive systems. If this cycle is dependent on the oxidation state of the tissue, it is possible that oxidative stress in the lens could perturb important mechanisms of gene expression and thereby contribute to the deleterious effects of the insult.
In general, the lens seems to provide an environment that promotes protein longevity. Turnover of proteins, particularly crystallins, is very low and in the differentiated fiber cells proteins can survive for decades. However, protease modification of crystallins and other major lens components does occur [60,76], perhaps as a programmed maturation or else as part of a process leading to cataract. Table 4 lists some proteases and protease inhibitors that are identified in the combined by and fs lens cDNA sequence collections. Calpains in particular have been intensively studied as key agents in certain models of cataract development. Two members of the calpain family are listed in Table 4. Calpains are cysteine proteases related to the enzyme papain. Another major group of cysteine proteases are the cathepsins. Four members of this family are also represented in the lens EST collection. In spite of the low turnover of lens proteins, components of the ubiquitin/proteasome machinery for tagging and elimination of damaged proteins [79,80] are also represented in the lens.
Nucleic acid binding proteins and transcription factors
Among the most abundant cDNAs in the by collection are two nucleic acid binding proteins, DNA binding protein A (7 ESTs) and calcium regulated heat stable protein (CRHSP-24; 3 ESTs). Both these proteins contain the cold shock domain (CSD), a variant RNA binding motif capable of specific interactions with nucleic acids [81,82]. One of them, CRHSP-24, is a calcineurin substrate and is thought to play a role in calcium mediated signal transduction. Calcineurin itself is also represented in the lens sequence collections.
At the cDNA level, the most abundantly represented transcription factor is Jun-D, with three ESTs in the by dataset (Table 5). Jun-D is a component of AP-1 transcriptional activity that has frequently been implicated in lens gene expression [84,85]. Two ESTs each were identified for heat shock factor 1, another protein involved in expression of some crystallins, and for NF-E1/YY1, a ubiquitous component of the transcriptional machinery. In by, single hits are observed for several other transcription factors including two that are important for lens development and function. These are Six3, a member of the sine oculis family, and c-Maf [89-91]. Maf family proteins are involved in tissue specific crystallin gene expression [89,92,93]. A related factor, BACH1, which is capable of heterodimerization with Maf proteins and binding to the Maf-response element (MARE), is also present.
The collection of sequences from the normalized lens fs library contains cDNAs for additional transcription factors that have important roles in the lens (Table 5). In particular these include the "master gene" Pax6, Sox2, FOXE3, HMX1, and PITX3. Two additional Sox family members, Sox13 and Sox22, are detected in the fs collection; Sox22 has previously been found in both optic and otic vesicles in human embryo. No role for Sox13 in eye seems to have been noted, but the gene is known to be expressed in pancreatic cells. Intriguingly, this is an extra-ocular location shared with Pax6.
One transcription factor represented in both by and fs collections is sterol regulatory element binding transcription factor 1 (SREBF1). Therapeutic use of steroids is a major cause of cataract. Although the cause of the resulting opacity is not firmly established, transcription factors involved in steroid response could certainly play a role. Another much more enigmatic transcript that may also be related to steroid response is represented by 2 ESTs in the by collection. These ESTs match a partial sequence described in GenBank as glucocorticoid receptor AF-1 coactivator-1 (GenBank Accession: AF173358). One clone for this cDNA from lens (by01b08) was completely sequenced and deposited in GenBank (GenBank Accession: AF353674). The lens clone is 1887 bp and contains a partial ORF of 1430 bp (Figure 5). The predicted protein sequence includes a BTB domain, a structure found in some transcription factors, including the Maf-interacting protein BACH1. The 126 amino acid BTB domain of the lens BTB domain protein (LBDP) is flanked by an N-terminal proline rich region and by a C-terminal region that are both well conserved with other BTB domain proteins. These regions contain a number of potential phosphorylation and glycosylation sites.
The gene for LBDP is located in the human genome on chromosome 14q32 inside an intron of the gene for the polymerase III factor BRF (GenBank Accession: NM_001519), but on the complementary DNA strand. 14q32 is the mapped location for Usher syndrome type 1A (OMIM: 276900), an inherited disorder that includes cataract along with defects in several other tissues. Considering its expression pattern, LBDP is a plausible locus of this disease. Unfortunately, completion of the cDNA sequence and definition of the gene has been made difficult because of the very high G+C content of the 5' part of the transcript.
Growth factor mediated communication between the lens and the rest of the eye is essential, both for the lens itself and for the normal development of the whole eye [104,105]. The initial sampling of cDNAs from by identified two clones each for connective tissue growth factor (CTGF) and for transforming growth factor (TGF) β1. Both these factors have been implicated in anterior polar cataracts. Members of the TGFβ/BMP family are known to have important roles in lens and eye development [107,108] and, in addition to TGFβ1, a cDNA for bone morphogenetic protein 6 (BMP6) is among the sequences obtained from fs. Table 6 lists several other growth factors, binding proteins and receptors identified so far in the combined by and fs datasets.
Several growth factor receptors, including FGF receptor 2, FGF receptor 3, and the EGF receptor, are present. Both FGF and EGF family members have been shown to affect differentiation of lens cells [105,109]. One EST (fs11h06) seems to be derived from a potentially novel 3' end of the Met-oncogene/hepatocyte growth factor receptor, which has been identified in human lens epithelial cells. In addition, two ESTs in the fs set encode "hypothetical protein FLJ22357," a new and uncharacterized protein whose entry in GenBank describes it as being "similar to EGF receptor related protein" (GenBank Accession: NP_071895). This relationship appears to be based on similarity to the Drosophila protein rhomboid, which has a role in activation of EGF receptor signaling in development.
Programmed cell death, or apoptosis, is an important process in all tissues, including the lens. In addition, the normal maturation of lens fiber cells has some striking parallels with features of apoptosis, most notably seen in the condensation of fiber cell nuclei and the breakdown of genomic DNA in the typical nucleosome ladder pattern [12-14]. The combined by and fs datasets contain several apoptosis related factors (Table 7). Both collections contain two ESTs for the recently described Livin (or KIAP), a protein of the "baculovirus inhibitor of apoptosis protein (IAP) repeat" or BIR family, which until now has only been detected in melanoma cells and some fetal tissues. Like other IAP/BIR proteins, Livin blocks the apoptotic effects of several factors, including Bax and TNFα, and has been shown to bind caspases [112,113].
The gene for Livin can be found in the human genome on chromosome 20q13.33 (GenBank Accession: NT_011333), where it is divided into seven exons. Exons 1 and 2 contain sequences for the conserved IAP/BIR domain. Of the four lens ESTs, one (by15h01) contains a variant splice. Sequence for this EST begins in intron 2 and runs on into exon 3, followed by normal splicing through the rest of the coding sequence. Examination of gene sequence shows that the variant EST contains an almost complete ORF (only 1 bp short) that runs in frame with the rest of the Livin coding sequence (Figure 6). This alternative transcript could therefore produce a variant protein lacking the conserved IAP domain that is essential for caspase binding, but retaining the C-terminal part of the protein, a region that contains a conserved RING domain, a structure that is responsible for subcellular localization and ubiquitination [112,114]. The functional consequences of such a modification remain to be determined, but conceivably the truncated protein could have a dominant negative effect, perhaps maintaining a key protein-protein interaction through the RING domain but lacking the activity conferred by the IAP domain. Whatever the function of this variant form, it is clear that the normal human lens is a significant site of Livin expression. Considering the parallels noted between lens cell maturation and apoptosis, it will be interesting to see what part Livin plays in lens cells and their unusual differentiation pathway.
This initial EST analysis of adult human lens has produced a rich resource for future work. Clearly, full description of the whole dataset is beyond the scope of one manuscript. The complete set of clones is collected in web format at NEIBank and an effort has begun to add layers of annotation and keywords to the database so that many functional classes beyond those described here can be displayed. As for other genomics related projects, complete analysis of this kind of dataset is a continually moving target. Each new release of GenBank can reveal more about the identity or function of any of the gene clusters we have identified, particularly those that are either novel or members of Unigene clusters of unknown function. This means that clusters and annotations will change as the data are reanalyzed.
The data have already led to the discovery of novel genes and splice variants. One example of gene discovery is lengsin (LGS). At the cDNA level, LGS is among the most abundant transcripts of the adult lens, although there was no prior indication of its presence in the tissue. Preliminary evidence suggests that lengsin is indeed highly lens preferred and, from its superfamily relationships with glutamine synthetases, it seems likely that it is involved in amino group transfer, but its substrate remains unknown. Why would the lens need a tissue specific enzyme? Considering the stresses to which the lens is exposed, one possibility is that lengsin has a specialized protective role. It might also serve to modify some other lens specific component, perhaps crystallins, but so far its function is a mystery. It is also worth remembering that, as other work on the lens has shown, proteins such as enzymes may have multiple, unexpected functions.
The discovery of an abundant new splice form of MP19/Lim2 shows that even familiar genes may have unexpected tricks to play. At least in the adult lens, a significant fraction of MP19 transcripts include a longer version of exon 2 that increases the size of an extracellular domain. The identification of full length clones for both versions of the MP19 transcript will allow us to express and examine the properties of the variant protein. Variants of other gene transcripts, such as ARVCF and Livin, also show that even if the number of human genes is lower than expected, the repertoire of proteins that can be produced is much greater. However, we have also noticed that many of the unidentified sequences in our collections do indeed appear to be the products of real genes, with exon/intron structure, even though they are unidentified in current versions of the human genome sequence. Thus, the number of human genes may turn out to be larger than the lower estimates after all. A major focus of future work will be the identification and validation of the unidentified novel genes revealed by the EST analyses. Novel or known, all of the sequences from this analysis have been collected as part of a large, nonredundant set of transcripts from human eye that are being used to create cDNA microarrays.
SLB is supported by the V. Kann Rasmussen Foundation (Denmark) and is a Career Development Awardee of Research to Prevent Blindness (RPB). We thank Dr. Weinu Gan for cDNA sequencing and Ray Tabios for technical assistance.
1. Bettelheim FA. Physical basis of lens transparency. In: Maisel H, editor. The Ocular lens: structure, function, and pathology. New York: Dekker; 1985. p. 265-300.
2. Thut CJ, Rountree RB, Hwa M, Kingsley DM. A large-scale in situ screen provides molecular evidence for the induction of eye anterior segment structures by the developing lens. Dev Biol 2001; 231:63-76.
3. Beebe DC, Coats JM. The lens organizes the anterior segment: specification of neural crest cell differentiation in the avian eye. Dev Biol 2000; 220:424-31.
4. Grainger RM. Embryonic lens induction: shedding light on vertebrate tissue determination. Trends Genet 1992; 8:349-55.
5. Piatigorsky J. Lens differentiation in vertebrates. A review of cellular and molecular features. Differentiation 1981; 19:134-53.
6. Jaworski C, Wistow G. LP2, a differentiation-associated lipid-binding protein expressed in bovine lens. Biochem J 1996; 320:49-54.
7. Li HS, Yang JM, Jacobson RD, Pasko D, Sundin O. Pax-6 is first expressed in a region of ectoderm anterior to the early neural plate: implications for stepwise determination of the lens. Dev Biol 1994; 162:181-94.
8. Kodama R, Eguchi G. From lens regeneration in the newt to in-vitro transdifferentiation of vertebrate pigmented epithelial cells. Semin Cell Biol 1995; 6:143-9.
9. Harding JJ, Crabbe MJC. The lens: Development, proteins, metabolism and cataract. In: Davson H, editor. The Eye, vol 1B. 3rd ed. Orlando (FL): Academic Press; 1984. p. 207-492.
10. Bron AJ, Vrensen GF, Koretz J, Maraini G, Harding JJ. The ageing lens. Ophthalmologica 2000; 214:86-104.
11. Wride MA. Minireview: apoptosis as seen through a lens. Apoptosis 2000; 5:203-9.
12. Beebe DC, Vasiliev O, Guo J, Shui YB, Bassnett S. Changes in adhesion complexes define stages in the differentiation of lens fiber cells. Invest Ophthalmol Vis Sci 2001; 42:727-34.
13. Ishizaki Y, Jacobson MD, Raff MC. A role for caspases in lens fiber differentiation. J Cell Biol 1998; 140:153-8.
14. Dahm R. Lens fibre cell differentiation - A link with apoptosis? Ophthalmic Res 1999; 31:163-83.
15. Barishak YR. Embryology of the eye and its adnexae. Dev Ophthalmol 1992; 24:1-142.
16. Wistow G. A project for ocular bioinformatics: NEIBank. Mol Vis 2002; 8:161-3 <http://www.molvis.org/molvis/v8/a22/>.
17. Simms D. mRNA isolation for high quality cDNA. Focus 1995; 17:39-42.
18. Hanahan D, Jessee J, Bloom FR. Plasmid transformation of Escherichia coli and other bacteria. Methods Enzymol 1991; 204:63-113.
19. Kriegler M. Gene Transfer and Expression: A Laboratory Manual. New York: Stockton Press; 1990.
20. Li WB, Gruber CE, Lin JJ, Lim R, D'Alessio JM, Jessee JA. The isolation of differentially expressed genes in fibroblast growth factor stimulated BC3H1 cells by subtractive hybridization. Biotechniques 1994; 16:722-9.
21. Swaroop A, Xu JZ, Agarwal N, Weissman SM. A simple and efficient cDNA library subtraction procedure: isolation of human retina-specific cDNA clones. Nucleic Acids Res 1991; 19:1954.
22. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8:186-94.
23. Bouffard GG, Iyer LM, Idol JR, Braden VV, Cunningham AF, Weintraub LA, Mohr-Tidwell RM, Peluso DC, Fulton RS, Leckie MP, Green ED. A collection of 1814 human chromosome 7-specific STSs. Genome Res 1997; 7:59-64.
24. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990; 215:403-10.
25. Wistow G, Bernstein SL, Touchman JW, Bouffard G, Wyatt MK, Peterson K, Gao J, Buchoff P, Smith D. Grouping and identification of sequence tags (GRIST): Bioinformatics tools for the NEIBank database. Mol Vis 2002; 8:164-70 <http://www.molvis.org/molvis/v8/a23/>.
26. Strausberg RL, Feingold EA, Klausner RD, Collins FS. The mammalian gene collection. Science 1999; 286:455-7.
27. Bonaldo MF, Lennon G, Soares MB. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 1996; 6:791-806.
28. Theil EC. The IRE (iron regulatory element) family: structures which regulate mRNA translation or stability. Biofactors 1993; 4:87-93.
29. Sinha D, Esumi N, Jaworski C, Kozak CA, Pierce E, Wistow G. Cloning and mapping the mouse Crygs gene and non-lens expression of [gamma]S-crystallin. Mol Vis 1998; 4:8 <http://www.molvis.org/molvis/v4/a8/>.
30. Wistow G, Sardarian L, Gan W, Wyatt MK. The human gene for gammaS-crystallin: alternative transcripts and expressed sequences from the first intron. Mol Vis 2000; 6:79-84 <http://www.molvis.org/molvis/v6/a11/>.
31. Lubsen NH, Aarts HJ, Schoenmakers JG. The evolution of lenticular proteins: the beta- and gamma-crystallin super gene family. Prog Biophys Mol Biol 1988; 51:47-76.
32. Brakenhoff RH, Aarts HJ, Reek FH, Lubsen NH, Schoenmakers JG. Human gamma-crystallin genes. A gene family on its way to extinction. J Mol Biol 1990; 216:519-32.
33. Siezen RJ, Wu E, Kaplan ED, Thomson JA, Benedek GB. Rat lens gamma-crystallins. Characterization of the six gene products and their spatial and temporal distribution resulting from differential synthesis. J Mol Biol 1988; 199:475-90.
34. Georgatos SD, Gounari F, Remington S. The beaded intermediate filaments and their potential functions in eye lens. Bioessays 1994; 16:413-8.
35. Yahara I, Aizawa H, Moriyama K, Iida K, Yonezawa N, Nishida E, Hatanaka H, Inagaki F. A role of cofilin/destrin in reorganization of actin cytoskeleton in response to stresses and cell stimuli. Cell Struct Funct 1996; 21:421-4.
36. Pfister KK. Cytoplasmic dynein and microtubule transport in the axon: the action connection. Mol Neurobiol 1999; 20:81-91.
37. Bassnett S, Missey H, Vucemilo I. Molecular architecture of the lens fiber cell basal membrane complex. J Cell Sci 1999; 112:2155-65.
38. Vasioukhin V, Fuchs E. Actin dynamics and cell-cell adhesion in epithelia. Curr Opin Cell Biol 2001; 13:76-84.
39. Sirotkin H, O'Donnell H, DasGupta R, Halford S, St Jore B, Puech A, Parimoo S, Morrow B, Skoultchi A, Weissman SM, Scambler P, Kucherlapati R. Identification of a new human catenin gene family member (ARVCF) from the region deleted in velo-cardio-facial syndrome. Genomics 1997; 41:75-83.
40. Zhurinsky J, Shtutman M, Ben-Ze'ev A. Plakoglobin and beta-catenin: protein interactions, regulation and biological roles. J Cell Sci 2000; 113:3127-39.
41. Varadaraj K, Kushmerick C, Baldo GJ, Bassnett S, Shiels A, Mathias RT. The role of MIP in lens fiber cell membrane transport. J Membr Biol 1999; 170:191-203.
42. Patil RV, Saito I, Yang X, Wax MB. Expression of aquaporins in the rat ocular tissue. Exp Eye Res 1997; 64:203-9.
43. Berry V, Mackay D, Khaliq S, Francis PJ, Hameed A, Anwar K, Mehdi SQ, Newbold RJ, Ionides A, Shiels A, Moore T, Bhattacharya SS. Connexin 50 mutation in a family with congenital "zonular nuclear" pulverulent cataract of Pakistani origin. Hum Genet 1999; 105:168-70.
44. Mulders JW, Voorter CE, Lamers C, de Haard-Hoekman WA, Montecucco C, van de Ven WJ, Bloemendal H, de Jong WW. MP17, a fiber-specific intrinsic membrane protein from mammalian eye lens. Curr Eye Res 1988; 7:207-19.
45. Gutekunst KA, Rao GN, Church RL. Molecular cloning and complete nucleotide sequence of the cDNA encoding a bovine lens intrinsic membrane protein (MP19). Curr Eye Res 1990; 9:955-61.
46. Rao GN, Gutekunst KA, Church RL. Bovine lens 23, 21 and 19 kDa intrinsic membrane proteins have an identical amino-terminal amino acid sequence. FEBS Lett 1989; 250:483-6.
47. Subramanian G, Takemoto L. Age-dependent covalent changes in MP18 from bovine lens membrane. Invest Ophthalmol Vis Sci 1991; 32:2588-92.
48. Kumar NM, Jarvis LJ, Tenbroek E, Louis CF. Cloning and expression of a major rat lens membrane protein, MP20. Exp Eye Res 1993; 56:35-43.
49. Maecker HT, Todd SC, Levy S. The tetraspanin superfamily: molecular facilitators. FASEB J 1997; 11:428-42.
50. Steele EC Jr, Kerscher S, Lyon MF, Glenister PH, Favor J, Wang J, Church RL. Identification of a mutation in the MP19 gene, Lim2, in the cataractous mouse mutant To3. Mol Vis 1997; 3:5 <http://www.molvis.org/molvis/v3/a5/>.
51. Steele EC Jr, Wang JH, Saperstein DA, Li X, Church RL. Lim2(To3) transgenic mice establish a causative relationship between the mutation identified in the lim2 gene and cataractogenesis in the To3 mouse mutant. Mol Vis 2000; 6:85-94 <http://www.molvis.org/molvis/v6/a12/>.
52. Rafferty NS. Lens morphology. In: Maisel H, editor. The Ocular lens: structure, function and pathology. New York: Dekker; 1985. p. 1-60.
53. Church RL, Wang JH. The human lens fiber-cell intrinsic membrane protein MP19 gene: isolation and sequence analysis. Curr Eye Res 1993; 12:1057-65.
54. Wistow GJ, Mulders JW, de Jong WW. The enzyme lactate dehydrogenase as a structural protein in avian and crocodilian lenses. Nature 1987; 326:622-4.
55. Wistow G, Piatigorsky J. Recruitment of enzymes as lens structural proteins. Science 1987; 236:1554-6.
56. Wistow G. Lens crystallins: gene recruitment and evolutionary dynamism. Trends Biochem Sci 1993; 18:301-6.
57. Wistow GJ. Molecular biology and evolution of crystallins: gene recruitment and multifunctional proteins in the eye lens. Austin (TX): R.G. Landes; 1995.
58. Spector A. Oxidative stress-induced cataract: mechanism of action. FASEB J 1995; 9:1173-82.
59. Taylor HR, West SK, Rosenthal FS, Munoz B, Newland HS, Abbey H, Emmett EA. Effect of ultraviolet radiation on cataract formation. N Engl J Med 1988; 319:1429-33.
60. Shearer TR, David LL, Anderson RS, Azuma M. Review of selenite cataract. Curr Eye Res 1992; 11:357-69.
61. Wistow G, Bernstein SL, Ray S, Wyatt MK, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of adult human iris for the NEIBank Project: Steroid-response factors and similarities with retinal pigment epithelium. Mol Vis 2002; 8:185-95 <http://www.molvis.org/molvis/v8/a25/>.
62. Gerashchenko DY, Beuckmann CT, Marcheselli VL, Gordon WC, Kanaoka Y, Eguchi N, Urade Y, Hayaishi O, Bazan NG. Localization of lipocalin-type prostaglandin D synthase (beta-trace) in iris, ciliary body, and eye fluids. Invest Ophthalmol Vis Sci 1998; 39:198-203.
63. Goh Y, Nakajima M, Azuma I, Hayaishi O. Prostaglandin D2 reduces intraocular pressure. Br J Ophthalmol 1988; 72:461-4.
64. Kumada Y, Benson DR, Hillemann D, Hosted TJ, Rochefort DA, Thompson CJ, Wohlleben W, Tateno Y. Evolution of the glutamine synthetase gene, one of the oldest existing and functioning genes. Proc Natl Acad Sci U S A 1993; 90:3009-13.
65. Turner SL, Young JP. The glutamine synthetases of rhizobia: phylogenetics and evolutionary implications. Mol Biol Evol 2000; 17:309-19.
66. Gribskov M, Luthy R, Eisenberg D. Profile analysis. Methods Enzymol 1990; 183:146-59.
67. Dharmaraj S, Li Y, Robitaille JM, Silva E, Zhu D, Mitchell TN, Maltby LP, Baffoe-Bonnie AB, Maumenee IH. A novel locus for Leber congenital amaurosis maps to chromosome 6q. Am J Hum Genet 2000; 66:319-26.
68. Giblin FJ. Glutathione: a vital lens antioxidant. J Ocul Pharmacol Ther 2000; 16:121-35.
69. Hayes JD, Strange RC. Potential contribution of the glutathione S-transferase supergene family to resistance to oxidative stress. Free Radic Res 1995; 22:193-207.
70. Lee KK, Murakawa M, Takahashi S, Tsubuki S, Kawashima S, Sakamaki K, Yonehara S. Purification, molecular cloning, and characterization of TRP32, a novel thioredoxin-related mammalian protein of 32 kDa. J Biol Chem 1998; 273:19160-6.
71. Arner ES, Holmgren A. Physiological functions of thioredoxin and thioredoxin reductase. Eur J Biochem 2000; 267:6102-9.
72. Butterfield LH, Merino A, Golub SH, Shau H. From cytoprotection to tumor suppression: the multifactorial role of peroxiredoxins. Antioxid Redox Signal 1999; 1:385-402.
73. Sandberg M, Hassett C, Adman ET, Meijer J, Omiecinski CJ. Identification and functional characterization of human soluble epoxide hydrolase genetic polymorphisms. J Biol Chem 2000; 275:28873-81.
74. Rutter J, Reick M, Wu LC, McKnight SL. Regulation of clock and NPAS2 DNA binding by the redox state of NAD cofactors. Science 2001; 293:510-4.
75. Reick M, Garcia JA, Dudley C, McKnight SL. NPAS2: an analog of clock operative in the mammalian forebrain. Science 2001; 293:506-9.
76. Schey KL, Fowler JG, Shearer TR, David L. Modifications to rat lens major intrinsic protein in selenite-induced cataract. Invest Ophthalmol Vis Sci 1999; 40:657-67.
77. Azuma M, Fukiage C, David LL, Shearer TR. Activation of calpain in lens: a review and proposed mechanism. Exp Eye Res 1997; 64:529-38.
78. Turk B, Turk V, Turk D. Structural and functional aspects of papain-like cysteine proteinases and their protein inhibitors. Biol Chem 1997; 378:141-50.
79. Shang F, Nowell TR Jr, Taylor A. Removal of oxidatively damaged proteins from lens cells by the ubiquitin-proteasome pathway. Exp Eye Res 2001; 73:229-38.
80. Pickart CM. Targeting of substrates to the 26S proteasome. FASEB J 1997; 11:1055-66.
81. Wistow G. Cold shock and DNA binding. Nature 1990; 344:823-4.
82. Landsman D. RNP-1, an RNA-binding motif is conserved in the DNA-binding cold shock domain. Nucleic Acids Res 1992; 20:2861-4.
83. Groblewski GE, Yoshida M, Bragado MJ, Ernst SA, Leykam J, Williams JA. Purification and characterization of a novel physiological substrate for calcineurin in mammalian cells. J Biol Chem 1998; 273:22738-44.
84. Piatigorsky J, Zelenka PS. Transcriptional regulation of crystallin genes: cis elements, trans-factors and signal transduction systems in the lens. In: Wassarman P, editor. Advances in developmental biochemistry, vol 1. Greenwich (CT): JAI Press; 1992. p. 211-56.
85. Rampalli AM, Zelenka PS. Insulin regulates expression of c-fos and c-jun and suppresses apoptosis of lens epithelial cells. Cell Growth Differ 1995; 6:945-53.
86. Somasundaram T, Bhat SP. Canonical heat shock element in the alpha B-crystallin gene shows tissue-specific and developmentally controlled interactions with heat shock factor. J Biol Chem 2000; 275:17154-9.
87. Thomas MJ, Seto E. Unlocking the mechanisms of transcription factor YY1: are chromatin modifying enzymes the key? Gene 1999; 236:197-208.
88. Oliver G, Gruss P. Current views on eye development. Trends Neurosci 1997; 20:415-21.
89. Kim JI, Li T, Ho IC, Grusby MJ, Glimcher LH. Requirement for the c-Maf transcription factor in crystallin gene regulation and lens development. Proc Natl Acad Sci U S A 1999; 96:3781-5.
90. Kawauchi S, Takahashi S, Nakajima O, Ogino H, Morita M, Nishizawa M, Yasuda K, Yamamoto M. Regulation of lens fiber cell differentiation by transcription factor c-Maf. J Biol Chem 1999; 274:19254-60.
91. Ring BZ, Cordes SP, Overbeek PA, Barsh GS. Regulation of mouse lens fiber cell development and differentiation by the Maf gene. Development 2000; 127:307-17.
92. Ogino H, Yasuda K. Induction of lens differentiation by activation of a bZIP transcription factor, L-Maf. Science 1998; 280:115-8.
93. Sharon-Friling R, Richardson J, Sperbeck S, Lee D, Rauchman M, Maas R, Swaroop A, Wistow G. Lens-specific gene recruitment of zeta-crystallin through Pax6, Nrl/Maf and brain suppressor sites. Mol Cell Biol 1998; 18:2067-76.
94. Igarashi K, Hoshino H, Muto A, Suwabe N, Nishikawa S, Nakauchi H, Yamamoto M. Multivalent DNA binding complex generated by small Maf and Bach1 as a possible biochemical basis for beta-globin locus control region complex. J Biol Chem 1998; 273:11783-90.
95. Gehring WJ, Ikeo K. Pax 6: mastering eye morphogenesis and eye evolution. Trends Genet 1999; 15:371-7.
96. Kamachi Y, Sockanathan S, Liu Q, Breitman M, Lovell-Badge R, Kondoh H. Involvement of SOX proteins in lens-specific activation of crystallin genes. EMBO J 1995; 14:3510-9.
97. Blixt A, Mahlapuu M, Aitola M, Pelto-Huikko M, Enerback S, Carlsson P. A forkhead gene, FoxE3, is essential for lens epithelial proliferation and closure of the lens vesicle. Genes Dev 2000; 14:245-54.
98. Wang W, Lo P, Frasch M, Lufkin T. Hmx: an evolutionary conserved homeobox gene family expressed in the developing nervous system in mice and Drosophila. Mech Dev 2000; 99:123-37.
99. Semina EV, Ferrell RE, Mintz-Hittner HA, Bitoun P, Alward WL, Reiter RS, Funkhauser C, Daack-Hirsch S, Murray JC. A novel homeobox gene PITX3 is mutated in families with autosomal-dominant cataracts and ASMD. Nat Genet 1998; 19:167-70.
100. Jay P, Sahly I, Goze C, Taviaux S, Poulat F, Couly G, Abitbol M, Berta P. SOX22 is a new member of the SOX gene family, mainly expressed in human nervous tissue. Hum Mol Genet 1997; 6:1069-77.
101. Kasimiotis H, Myers MA, Argentaro A, Mertin S, Fida S, Ferraro T, Olsson J, Rowley MJ, Harley VR. Sex-determining region Y-related protein SOX13 is a diabetes autoantigen expressed in pancreatic islets. Diabetes 2000; 49:555-61.
102. Turque N, Plaza S, Radvanyi F, Carriere C, Saule S. Pax-QNR/Pax-6, a paired box- and homeobox-containing gene expressed in neurons, is also expressed in pancreatic endocrine cells. Mol Endocrinol 1994; 8:929-38.
103. Lubkin VL. Steroid cataract--a review and a conclusion. J Asthma Res 1977; 14:55-9.
104. Tripathi BJ, Tripathi RC, Livingston AM, Borisuth NS. The role of growth factors in the embryogenesis and differentiation of the eye. Am J Anat 1991; 192:442-71.
105. McAvoy JW, Chamberlain CG, de Iongh RU, Hales AM, Lovicu FJ. Lens development. Eye 1999; 13:425-37.
106. Lee EH, Joo CK. Role of transforming growth factor-beta in transdifferentiation and fibrosis of lens epithelial cells. Invest Ophthalmol Vis Sci 1999; 40:2025-32.
107. Wawersik S, Purcell P, Rauchman M, Dudley AT, Robertson EJ, Maas R. BMP7 acts in murine lens placode development. Dev Biol 1999; 207:176-88.
108. Furuta Y, Hogan BL. BMP4 is essential for lens induction in the mouse embryo. Genes Dev 1998; 12:3764-75.
109. Lang RA. Which factors stimulate lens fiber cell differentiation in vivo? Invest Ophthalmol Vis Sci 1999; 40:3075-8.
110. Wormstone IM, Tamiya S, Marcantonio JM, Reddan JR. Hepatocyte growth factor function and c-Met expression in human lens epithelial cells. Invest Ophthalmol Vis Sci 2000; 41:4216-22.
111. Klambt C. EGF receptor signalling: the importance of presentation. Curr Biol 2000; 10:R388-91.
112. Kasof GM, Gomes BC. Livin, a novel inhibitor of apoptosis protein family member. J Biol Chem 2001; 276:3238-46.
113. Lin JH, Deng G, Huang Q, Morser J. KIAP, a novel member of the inhibitor of apoptosis protein family. Biochem Biophys Res Commun 2000; 279:820-31.
114. Yang YL, Li XM. The IAP family: endogenous caspase inhibitors with multiple biological activities. Cell Res 2000; 10:169-77.