Wistow, Mol Vis 2002; 8:196-204.

Molecular Vision 2002; 8:196-204 <http://www.molvis.org/molvis/v8/a26/>
Received 31 August 2001 | Accepted 14 December 2001 | Published 15 June 2002

Expressed sequence tag analysis of human retina for the NEIBank Project: Retbindin, an abundant, novel retinal cDNA and alternative splicing of other retina-preferred gene transcripts

Graeme Wistow,¹ Steven L. Bernstein,² M. Keith Wyatt,¹ Sugata Ray,¹ Amita Behal,¹ Jeffrey W. Touchman,³ Gerard Bouffard,³ Don Smith,¹ Katherine Peterson¹

¹Section on Molecular Structure and Function, National Eye Institute, National Institutes of Health, Bethesda, MD; ²Departments of Ophthalmology and Neurobiology & Genetics, University of Maryland School of Medicine, Baltimore, MD; ³NIH Intramural Sequencing Center, Gaithersburg, MD

Correspondence to: Graeme Wistow, Ph.D., Chief, Section on Molecular Structure and Function, National Eye Institute, Building 6, Room 331, National Institutes of Health, Bethesda, MD, 20892-2740; Phone: (301) 402-3452; FAX: (301) 496-0078; email: graeme@helix.nih.gov

Abstract

Purpose: Expressed sequence tag (EST) analysis was performed on un-normalized, unamplified cDNA libraries constructed from adult human retina to examine the expression profile of the tissue and to contribute resources for functional genomics studies.

Methods: Two size fractionated cDNA libraries (designated hd and he) were constructed from human retina RNA. Clones were randomly selected for sequencing and analyzed using the bioinformatics program GRIST (GRouping and Identification of Sequence Tags). PCR, Northern blotting and other techniques have been used to examine selected novel transcripts.

Results: After informatics analysis, 2200 retina cDNAs yield 1254 unique clusters, potentially representing individual genes. Opsin is the most abundant transcript and other retina transcripts are prominently represented. One abundant cluster of cDNAs encodes retbindin, a novel, retina preferred transcript which has sequence similarity to riboflavin binding proteins and whose gene is on chromosome 19. Variant transcripts of known retina genes are also observed, including an alternative exon in the coding sequence of the transcription factor NRL and a skipped coding sequence exon in the phosphodiesterase γ subunit (PDE6G).

Conclusions: The new retina cDNA libraries compare favorably in quality with those already represented in public databases. They are rich in retina specific sequences and include abundant cDNAs for a novel protein, retbindin. The function of retbindin remains to be determined, but it is a candidate for flavinoid or carotenoid binding. Analysis of multiple clones for highly expressed retina genes reveals several alternative splice variants in both coding and noncoding sequences which may have functional significance. The validated set of retina cDNAs will contribute to a nonredundant set for microarray construction.

Introduction

The retina is the tissue of the eye where light begins its transformation into sight. Specialized photoreceptor cells transduce photons into nerve impulses that are organized through complex layers of nerve cells and transmitted on to the perception centers of the brain. Not surprisingly, the retina has a large complement of tissue specific proteins and mutations in the genes for many of these proteins have been identified in a wide variety of inherited eye defects [1,2]. The retina also has much in common with other neural tissues, particularly those related to hearing, so that conditions like Usher's syndrome affect both hearing and sight through mutations in genes common to ear and eye [3-7].

Expressed sequence tag (EST) analysis of retina has already produced thousands of sequences in dbEST and these have been instrumental in identifying several disease loci. An important example of this was the discovery of the gene that is defective in Stargardt's disease, an inherited macular degeneration [8]. In addition to their value in gene discovery, ESTs are an essential resource for functional genomics studies, particularly for the construction of cDNA microarrays to compare gene expression levels in disease, aging and development [9].

Although many retinal ESTs have been sequenced (see compilation at Unigene), not all are available as DNA for further studies. The public databases also contain many sequences of poor quality or of contaminant origin. In addition, many of the libraries that have been used for EST analysis are amplified or subtracted [10]. These manipulations have the potential to reduce the content of certain clones, such as those that grow poorly in bacterial hosts. Most EST projects also concentrate on 3' reads. Since this biases data towards 3' untranslated regions (UTR) where introns are very uncommon, this reduces the chances of observing novel splice forms that would give rise to variant proteins with structural and functional differences.

For these reasons and to obtain clones for construction of a human eye cDNA microarray, we constructed unamplified cDNA libraries from dissected, post mortem human retina as part of the NEIBank project [11]. Here we describe initial analysis of over 2200 5' sequence reads from this effort. The data reveal novel, or previously unidentified, genes and numerous splice variants with potential functional significance. All sequences are available through NEIBank.

Methods

Tissue and RNA preparation

Human post-mortem eye tissues were obtained under University of Maryland School of Medicine IRB exemptions SB-019701 and SB-129901. Neural retina tissue was dissected from two 80 year old white male donors with no observed eye disease. Total RNA was extracted using RNAzol (Tel-Test Inc., Friendswood, TX). 100 μg of total RNA was used for library construction. PolyA RNA was prepared using an oligo(dT) cellulose affinity column [12].

cDNA Library Construction

A directionally cloned cDNA library in the pSPORT1 vector (Life Technologies, Rockville MD) was constructed at Bioserve Biotechnology (Laurel, MD). General details of library construction for NEIBank cDNA libraries are described elsewhere [13]. For this library, cDNA inserts were cloned into the NotI/SalI sites of the vector. In this case, as an additional measure to remove small contaminant fragments, the cDNA was run over a Sephacryl S-500 HR resin column designed to fractionate cDNA >500 bp (Life Technologies). The columns were run in TEN buffer: 10 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 25 mM NaCl. Sub-libraries (designated hd and he for sequencing purposes) were made from the first two 35 μl fractions containing cDNA.

cDNA Sequencing and Bioinformatics

Methods for sequencing and bioinformatics analysis are described in detail elsewhere [14]. Briefly, randomly picked clones were sequenced at the NIH Intramural Sequencing Center (NISC). Clones were sequenced only from the 5' end. A specially developed software tool, GRIST (GRouping and Identification of Sequence Tags) was used to analyze the data and assemble the results in web page format. Clusters of sequences were aligned using SeqMan II (DNAstar, Madison, WI) to examine alternative transcripts. Sequences were also searched through genome resources at the National Center for Biotechnology Information (NCBI), the Human Genome Project, and the Celera Genomics Group. Protein motifs were searched using GenomeNet, the NCBI, and the Swiss Institute of Bioinformatics.

Polymerase Chain Reaction (PCR) Methods

PCR was used to validate alternative splice forms, to obtain probes for hybridization, and to complete sequences. For template, a sample of the complete cDNA library representing at least one million primary clones was amplified and plasmid was isolated using reagents from QIAGEN (Valencia, CA). Fragments were amplified using either Taq (Roche, Indianapolis, IN) or Elongase (Life Technologies, Gaithersburg, MD) polymerase systems and following manufacturer's protocols.

Northern Blot

Multi-tissue northerns were purchased from Clontech (Palo Alto CA). Northerns for rhesus monkey (Macaca mulatta) eye tissues were prepared as described previously [15]. A cDNA for human retbindin protein was identified from the retina library. The insert was excised and labelled using a prime-it II kit (Stratagene systems, La Jolla, CA) and ³²P-labelled dCTP.

Northern blots were prehybridized in Hybrisol II (Oncor, Gaithersburg, MD) for 4 h, followed by hybridization with the specific radiolabelled cDNA probe at 63 °C for 18 h. After hybridization, membranes were washed in 0.2X SSC, 0.1% SDS at 63 °C and exposed to Kodak XAR or BMR photographic film for varying lengths of time at -70 °C.

Results & Discussion

Library Statistics

During library synthesis, the cDNA was broadly fractionated into two size ranges and subcloned into libraries designated hd and he for sequencing purposes. As in other NEIBank libraries, all clones are numbered according to their library designation and their position in 96 well plates, e.g. hd01a01. For hd, there were 6.1x10⁶ primary transcripts with an average insert size of 1 kbp. 4% of clones contained no insert and 5% contained mitochondrial genome sequence. For he, there were 2.4x10⁶ primary transcripts with an average insert size of 1.5 kbp. 3% of clones contained no insert and 5% contained mitochondrial genome sequence. Both libraries contained 2% rRNA. Equal numbers of clones were sequenced from both libraries and the data were analyzed together.
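
The clone naming convention described above (library designation followed by a position in a 96 well plate, e.g. hd01a01) can be unpacked programmatically. The following is a minimal sketch, not part of the NEIBank software. It assumes that the name consists of a two-letter library code, a two-digit plate number, a row letter (a to h) and a two-digit column number (01 to 12); this layout is inferred from the clone names quoted in this paper, and the `CloneId` type and `parse_clone_id` function are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class CloneId(NamedTuple):
    """Parsed components of a clone name such as 'hd01a01' (assumed layout)."""
    library: str  # library designation, e.g. 'hd' or 'he'
    plate: int    # plate number
    row: str      # row letter of a 96 well plate, 'a' to 'h'
    column: int   # column number of a 96 well plate, 1 to 12


# Assumed pattern: two letters, two digits, one row letter, two digits.
_CLONE_RE = re.compile(
    r"^(?P<library>[a-z]{2})(?P<plate>\d{2})(?P<row>[a-h])(?P<column>\d{2})$"
)


def parse_clone_id(name: str) -> CloneId:
    """Split a clone name into library, plate, row and column."""
    match = _CLONE_RE.match(name.strip().lower())
    if match is None:
        raise ValueError(f"not a recognizable clone name: {name!r}")
    column = int(match["column"])
    if not 1 <= column <= 12:
        raise ValueError(f"column out of range for a 96 well plate: {name!r}")
    return CloneId(match["library"], int(match["plate"]), match["row"], column)


if __name__ == "__main__":
    for clone in ("hd01a01", "he05h12", "hd09f04"):
        print(clone, parse_clone_id(clone))
```

Parsing the names in this way makes it straightforward to group clones by library (hd or he) or to locate a clone on its plate.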

Clones

A total of 2206 quality 5' reads from the combined hd and he libraries yielded 1884 clones after removal of contaminants and very short sequences and masking of repetitive sequences. Analysis of these clones through GRIST [16] resulted in 1255 GRIST clusters of which 85% contain single clones. Each cluster potentially represents a unique gene expressed in retina. About 20% of the clusters have no match in Unigene, although this number varies as Unigene evolves. A large proportion of all the clones have no known function, even though other ESTs in dbEST are present.
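
The arithmetic behind these figures can be laid out as a short calculation. The sketch below uses only the numbers quoted in the paragraph above (2206 reads, 1884 clones, 1255 clusters, 85% single-clone clusters); the derived values, such as the approximate number of single-clone clusters, are rounded estimates rather than reported results, and the function name `summarize_clustering` is a hypothetical label.

```python
def summarize_clustering(reads: int, clones: int, clusters: int,
                         singleton_fraction: float) -> dict:
    """Derive simple summary statistics from the reported read and cluster counts."""
    singleton_clusters = round(clusters * singleton_fraction)  # estimate only
    multi_clusters = clusters - singleton_clusters
    clones_in_multi = clones - singleton_clusters
    return {
        "reads_removed": reads - clones,
        "fraction_retained": clones / reads,
        "singleton_clusters": singleton_clusters,
        "multi_clone_clusters": multi_clusters,
        "mean_clones_per_cluster": clones / clusters,
        "mean_clones_per_multi_cluster": clones_in_multi / multi_clusters,
    }


if __name__ == "__main__":
    stats = summarize_clustering(reads=2206, clones=1884, clusters=1255,
                                 singleton_fraction=0.85)
    for key, value in stats.items():
        print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

On these numbers, roughly 15% of reads are removed before clustering, and the clusters that contain more than one clone hold several clones each on average, as expected for an un-normalized library dominated by a few abundant transcripts.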

Prior to this effort, we examined the available data for human retina ESTs in GenBank. Approximately 30,000 retina ESTs (primarily from five libraries: Soares retina N2b4HR, Soares retina N2b5HR, human retina randomly primed sub-library12b7, Stratagene retina 937292, and Retina I) were downloaded from the dbEST section of GenBank and analyzed through GRIST. Table 1 shows a list of the most abundant clusters from this analysis. GAPDH, EF1α, and collagen 1α1 are the most common cDNA clones. Contaminants, including mitochondrial genome, vector (principally λ phage), and bacterial sequences, are also highly represented, some ranking among the most abundant sequences in the retina section of dbEST (Table 1). Indeed, at this time the human section of Unigene contained a cluster (which has since been removed) that contained sequences from E. coli enolase, all derived from a human retina library. There also were a large number of short sequences with apparent sequencing or PCR derived errors. In contrast, the hd/he data have a very low content of contaminants and an average length of quality sequence reads of over 500 bp. The most abundant cDNA is opsin, at almost 3% of the total (Table 2). The sequence collection also contains a good representation of other retina markers, as indicated in Table 3.

After opsin, the second most abundant cDNA in the hd/he collection is the protective enzyme glutathione peroxidase 3, accounting for almost 2% of the total sequenced. This enzyme has recently been suggested as a candidate for involvement in age related macular degeneration, based on a genome wide screen for susceptibility loci [17] and is also one of the most abundant transcripts in the NEIBank human iris collection [18]. Other highly abundant retina transcripts in the hd/he sequence collection include rod phosphodiesterase γ (PDE6G), S-antigen (arrestin), α-transducin, the transcription factor NRL, ROM1, recoverin and UNC119 (Table 2). The list of cDNAs represented at least seven times also includes a cluster of ESTs for a novel retina transcript.
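
The relative abundances quoted in this section can be checked against the EST counts given elsewhere in the text for individual clusters (52 for opsin, 25 for PDE6G, 13 for arrestin, 9 for NRL and 8 for retbindin). The sketch below is illustrative only: it assumes that the appropriate denominator is the 1884 clones retained after filtering, which the text does not state explicitly, and the function name `percent_abundance` is hypothetical.

```python
# EST counts per cluster as quoted in the text of this paper.
EST_COUNTS = {
    "opsin": 52,
    "PDE6G": 25,
    "arrestin": 13,
    "NRL": 9,
    "retbindin": 8,
}

# Assumption: abundance is expressed relative to the 1884 filtered clones.
TOTAL_CLONES = 1884


def percent_abundance(count: int, total: int = TOTAL_CLONES) -> float:
    """Return the share of the collection represented by one cluster, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


if __name__ == "__main__":
    for gene, count in sorted(EST_COUNTS.items(), key=lambda item: -item[1]):
        print(f"{gene:10s} {count:3d} ESTs  {percent_abundance(count):.2f}%")
```

Under this assumption, opsin comes out at a little under 3% and retbindin at about 0.4% of the collection, in line with the values quoted in the text.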

Retbindin: An abundant, novel transcript

The hd/he sequence collection contains a cluster of eight ESTs from a transcript with no significant match to named GenBank sequences. Completion of the retina sequences (Figure 1A) produced a 1043 bp cDNA sequence, which has been submitted to GenBank (GenBank Accession: AY028917). Search of dbEST revealed a group of ESTs, collected in Unigene cluster (Hs.21162), that come mainly from retina and brain and correspond to the 3' untranslated region (UTR) of this gene. The gene encoding this transcript can now be identified in human genome sequence, where it is located on chromosome 19p13. Five essentially full length cDNA clones from our retina collection define a compact gene of about 5 kbp divided into 6 exons, the first of which is noncoding (Figure 1A,B). The sixth full length cDNA from retina has a different 5' end that defines a noncoding exon 200 bp upstream of the first exon shared by the other clones. Both these exons splice to the same second exon that contains the start site of the coding sequence (Figure 1B). This could indicate either the presence of two promoters for the gene or of a single promoter and alternative splicing in the 5' UTR. Either option would produce variant 5' UTRs. As this paper was in preparation, a new unidentified sequence from the same gene was deposited in GenBank (Mammalian Gene Catalog BC005063). This sequence uses two 5' exons that are different from those used in retina (not shown), suggesting that the gene does indeed have multiple promoters.

The novel retina sequence contains an open reading frame (ORF) of 229 amino acids (Figure 1A) that predicts a protein of 24,614 Da size and isoelectric point (pI) of 8.2. Comparison of this ORF with protein sequences in GenBank reveals a weak but significant match (27% identity over 135 residues) with riboflavin binding proteins, which are secreted, glycosylated proteins of various egg laying vertebrates (Figure 2), and also more distant similarities with some mammalian folate receptors. This suggests that the protein belongs to a superfamily involved in binding cyclic or polyunsaturated molecules. As a result, the name retbindin was chosen. The predicted protein sequence of retbindin contains 11 cysteines that are conserved in riboflavin binding proteins in which most are involved in intramolecular disulfide bonds. The sequence also contains conserved glycosylation sites, suggesting that the protein is likely to be secreted and post-translationally modified.

Northern blots of rhesus monkey retina, brain, liver, adrenal, ovary, heart, muscle, and lung probed with cDNA for retbindin show a major band of approximately 1.2 knt only in retina (Figure 3). Thus, expression of retbindin appears to be highly retina preferred. While the function of retbindin remains to be determined, potential binding ligands include retinoids and other carotenoids (such as lutein) that have important roles in retina [19-21]. Flavinoids are another possibility. The role of riboflavin in the retina is usually thought of in negative terms, as a target for UV absorption and subsequent generation of damaging free radicals [22,23]. By transporting or sequestering flavinoids, retbindin could conceivably protect the retina from such insult.

Alternative transcripts of retina expressed genes

As estimates of the number of genes in the human genome have fallen, interest in alternative splicing and alternative transcripts as sources of increased diversity has increased [24]. EST analysis of an un-normalized library naturally produces multiple cDNAs for abundantly expressed genes. Particularly for 5' reads that cover coding sequence and exon/intron boundaries, these ESTs can reveal alternative splice forms that may alter protein products. Alternative splicing or use of alternative promoters or polyadenylation signals may also exert post-transcriptional regulatory effects through changes to UTRs. Several genes have sequences in their 5' or 3' UTRs that affect expression through translational control or mRNA stability, although in most cases the mechanisms involved are not well understood [25,26]. In addition to these potentially functional classes, other variant cDNAs, particularly those derived from introns and lacking ORFs, may represent failures of the splicing process. Such errors could have their own biological significance. It has been speculated that age related failures in normal transcript splicing, particularly for highly expressed genes, could contribute to a decline in cellular function with age in eye tissues [27]. For the hd/he data, several retina preferred transcripts are abundant enough to allow examination for variant forms.

Opsin

All reads for opsin transcripts in the hd/he collection are from the 5' end and no systematic effort was made to map alternative 3' ends. Nevertheless, alignment of the 52 ESTs for opsin from hd/he shows evidence for three alternative sites for polyadenylation. Relative to a 2768 bp full length reference cDNA sequence in GenBank (GenBank Accession number XM_003284), these cDNAs truncate the long 3' UTR of opsin at positions 1736, 1818 and 1835 bp. Other alternative 3' ends for opsin may also be present in other ESTs that extend further 3' than these sites and have not been sequenced from the 3' end. Nevertheless, this shows that this major retina transcript exhibits considerable polymorphism in the length of its 3' UTR. Our data also reveal another type of polymorphism in opsin. The 5' UTR sequences of the collected opsin cDNAs reveal potential paired single nucleotide polymorphisms (SNP); G->A at position 45 and A->G at position 59, relative to the full length reference sequence. Both G45/A59 and A45/G59 pairs are nearly equally represented (10 and 9) in the group of 19 clones whose 5' ends cover most of the opsin 5' UTR. SNPs can be used as markers in linkage analysis of disease and a collection of SNPs is currently being assembled by NCBI.
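
To make the scale of these differences concrete, the short calculation below works out how much of the 2768 bp reference sequence is lost at each of the three truncation positions, and the proportions of the two paired SNP forms among the 19 informative clones. It is a sketch based only on the positions and counts given above; the function names are hypothetical.

```python
REFERENCE_LENGTH_BP = 2768                 # full length reference cDNA (XM_003284)
TRUNCATION_SITES_BP = (1736, 1818, 1835)   # alternative 3' ends seen in hd/he
PAIRED_SNP_COUNTS = {"G45/A59": 10, "A45/G59": 9}


def bases_removed(site: int, reference_length: int = REFERENCE_LENGTH_BP) -> int:
    """Number of reference bases lying 3' of a truncation site."""
    if not 0 < site <= reference_length:
        raise ValueError("site must lie within the reference sequence")
    return reference_length - site


def form_fractions(counts: dict) -> dict:
    """Fraction of clones carrying each paired SNP form."""
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}


if __name__ == "__main__":
    for site in TRUNCATION_SITES_BP:
        print(f"end at {site} bp: {bases_removed(site)} bp shorter than reference")
    for form, fraction in form_fractions(PAIRED_SNP_COUNTS).items():
        print(f"{form}: {fraction:.2f}")
```

Each of the three alternative ends shortens the transcript by roughly 0.9 to 1.0 kb relative to the reference, and the two SNP pairings each account for about half of the informative clones.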

The rod cell version of opsin is, as expected, the most abundant of the opsins expressed in retina; however, our retina EST collection also contains a single cDNA for the long wave (red) sensitive cone pigment (he07e02) (Table 3).

Phosphodiesterase 6 γ

Phosphodiesterase 6 γ (PDE6G) is the inhibitory subunit of the cyclic GMP phosphodiesterase (PDE) of the rod photoreceptors. Mice that lack PDE6G through gene knockout lack PDE activity and undergo retinal degeneration [28]. In terms of cDNA abundance, PDE6G is ranked fourth in abundance in the hd/he collection, represented by 25 ESTs. Of these, 14 are long enough to cover exon 1 of the gene (GenBank Accession: X62025) and of these three (hd01c02, he01d06 and he05h12) exhibit an alternative splice that skips exon 2, the location of the ORF start site (Figure 4), thereby removing a large part of the coding sequence. As shown in Figure 4, the alternative splice form of PDE6G contains a truncated ORF that starts with an ATG in an acceptable consensus. This would produce a shorter version of the phosphodiesterase subunit, corresponding to the last 31 amino acids. This structure could be functionally significant. Recently, the x-ray structure of a complex including PDE6G and α-transducin has been described [29]. From this model, it is apparent that the exon skipped version of PDE6G transcript would produce a protein domain including the motif of three α-helices that is involved in complex formation with α-transducin and interaction with the regulatory switch II region that undergoes conformation change upon GTP hydrolysis [29]. The predicted short form of PDE6G would be able to bind α-transducin in a nucleotide dependent manner. The short form would lack the highly charged N-terminal region of PDE6G that seems to be involved in binding to the α and β subunits of PDE [30]. The functional consequences of this truncation will need to be examined to determine whether this variant form accounts for a normal part of PDE6G activity or whether it represents a possible age related defect in transcript processing.

Two other clones (he11e03 and he11c04) originate, respectively, from intron 1 and intron 2 of the PDE6G gene and probably could not give rise to functional proteins. It seems most likely that these clones result from splicing errors. Overall, a significant fraction of PDE6G transcripts differ from the canonical form.
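
Because these proportions rest on small numbers of clones, it can be useful to attach an uncertainty range to them. The sketch below estimates the fraction of exon skipped PDE6G clones (3 of the 14 that cover exon 1) together with an approximate 95% Wilson score interval. The Wilson interval is a standard statistical method chosen here for illustration; it was not part of the original analysis, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Approximate Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a nominal 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    z2 = z * z
    denominator = 1.0 + z2 / total
    center = (p + z2 / (2.0 * total)) / denominator
    half_width = (
        z * math.sqrt(p * (1.0 - p) / total + z2 / (4.0 * total * total))
        / denominator
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    skipped, informative = 3, 14  # exon skipped clones / clones covering exon 1
    low, high = wilson_interval(skipped, informative)
    print(f"exon skipped fraction: {skipped / informative:.2f} "
          f"(approx. 95% interval {low:.2f} to {high:.2f})")
```

The interval is wide, which is a reminder that the observed frequency of about one in five is only a rough estimate of the true proportion of exon skipped transcripts.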

Again, the rod version of the PDE γ subunit is represented most abundantly, but the hd/he collection also includes one copy of the cone cell version (PDE6H), a clone that appears to be full length (Table 3).

Arrestin

Alternative splicing in the 5' UTR can be seen in the cluster of clones for arrestin or S-antigen. This component of the phototransduction cycle in rod cells [31] is represented by 13 cDNA clones. Of these, 8 are essentially full length and of these, two contain an insert sequence derived by alternative splicing. The insertion consists of 49 bp immediately 5' to exon 2 and results from use of an alternative 5' end of that exon (Figure 5).

α-Transducin and UNC-119/HRG4

In addition to potentially novel variant transcripts, the hd/he collection contains representatives of alternative splice forms that have been observed before. For example, published sequence data for α-transducin have shown the existence of two splice forms of the 3' UTR, changing the length of this region by over 1 knt [32]. Both versions are represented here. The human homolog of the C. elegans UNC-119 gene has preferred expression in photoreceptors [33]. It also ranks among the most abundant cDNAs in the hd/he collection. It has previously been reported that there are two versions of the human UNC-119 transcript, one of which results from retention of the last intron [33]. Again, both these variants are represented at similar levels in the cluster of hd/he ESTs. It will be important to include these variant forms in microarrays for future expression studies of eye tissues.

NRL

NRL is a basic-leucine zipper protein of the Maf family of transcription factors [34]. Proteins that bind maf-response elements (MARE) have important roles in tissue specific expression of genes in retina and lens [35-38]. C-Maf is essential for eye development [38-40] while mutations in the gene for NRL have been identified in the retinal degeneration RP27 [41].

The hd/he data produce a cluster of 9 ESTs for NRL, making cDNAs for this transcription factor among the most abundant in the retina collection (Table 2). When the sequences are aligned, it is apparent that one clone (hd09f04) contains a region that does not match the overall consensus for this gene. The novel sequence can be found in the gene for NRL [42], where it is positioned between exons 2 and 3 (Figure 6A). A search of dbEST finds one other EST (GenBank Accession number BE257768) from a retinoblastoma derived library (Mammalian Gene Catalog MGC:16) that, although of poor sequence quality, clearly contains a shorter version of the same insert sequence.

In the gene, the insert sequence is flanked at the 3' end by a consensus splice junction (Figure 6A) and is clearly an alternative exon of the NRL gene. The available sequence for the insert contains an ORF that is in frame with exon 3 derived sequence. The result of the inclusion of this alternative exon would be the insertion of an additional, glycine rich basic region N-terminal to the conserved basic-leucine zipper domain of the transcription factor. Such a sequence feature is intriguingly reminiscent of the long glycine-rich region in the same relative position of the related transcription factor c-Maf (GenBank accession number AAC27039) (Figure 6B). Figure 6A shows possible in frame splice junctions upstream of the insert exon and the further 5' limit of the exon, defined by an in frame stop codon. Unfortunately, the 5' end of the alternative NRL exon is not yet experimentally determined. Various PCR methods have been used in attempts to extend the variant sequence, but so far no additional cDNA sequence has been obtained. The insert exon is over 70% G/C in base composition and there may be a significant block to reverse transcription in this region. It is also possible that the novel exon represents an alternative first exon for a truncated NRL. Figure 6A shows an in frame start codon that could be the start site for the ORF of such a transcript. Work in progress on additional clones obtained from the hd/he libraries lends support for this possibility. Efforts are continuing to determine the complete structure of this NRL variant.

Unidentified Clones

As illustrated by retbindin, ESTs can reveal novel gene transcripts. Many of the clones sequenced from the hd/he libraries have no significant match with named genes in GenBank or ESTs in dbEST. While in some cases this is simply due to lengths that are too short or to poor sequence quality that fails to meet the stringent criteria used in BLAST and GRIST, many of these "unidentified" clones represent authentic novel genes or genes that are known only from anonymous 3' ESTs. The 5' reads from the present data have a greater chance of identifying protein coding sequence and exon/intron structure in genome data. One such example is clone he05g01. This clone matches no other ESTs or GenBank entries. However, BLASTX of the sequence reveals a significant match to calcium binding EF hand proteins [43], including calcium binding transporters [44]. The ORF in the clone contains 3 consensus EF hand motifs (Figure 7). BLASTN against Human Genome sequence shows that the cDNA obtained corresponds to 4 exons in the NCBI draft sequence of human chromosome 9. This clone is just one of a number of apparently novel genes that need further analysis.
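
A motif scan of the kind used to recognize EF hand motifs in a predicted ORF can be sketched with a regular expression. The example below is a deliberately simplified illustration, not the procedure or the motif definition used in this study: the 12 residue pattern (aspartate at position 1, D/N/S at position 3, one of D/E/N/S/T/G at position 5 and D/E at position 12) is an assumption made for this example, and the test peptide is a short illustrative sequence, not the he05g01 sequence. The names `SIMPLIFIED_EF_LOOP` and `find_candidate_loops` are hypothetical.

```python
import re

# Simplified, assumed 12-residue calcium binding loop pattern (illustration only).
# A lookahead is used so that overlapping candidates are all reported.
SIMPLIFIED_EF_LOOP = re.compile(r"(?=(D.[DNS].[DENSTG].{6}[DE]))")


def find_candidate_loops(protein: str) -> list:
    """Return (1-based start position, matched 12-mer) for each candidate loop."""
    sequence = protein.strip().upper()
    return [(m.start() + 1, m.group(1)) for m in SIMPLIFIED_EF_LOOP.finditer(sequence)]


if __name__ == "__main__":
    # Illustrative peptide only; not taken from the clone described in the text.
    peptide = "MAEFKEAFSLFDKDGDGTITTKELGTV"
    for start, loop in find_candidate_loops(peptide):
        print(f"candidate loop at residue {start}: {loop}")
```

A pattern this loose will report false positives in real data, so any hits would need to be confirmed against a curated motif database and by alignment with known EF hand proteins.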

Conclusions

Although the dbEST section of GenBank contains many clones derived from human retina, it was important for us to include this key ocular tissue in our survey of gene expression in the human eye. Examination of existing data suggested that many of the clones that have been described previously have problems of poor sequence quality and short clone length. Furthermore, many are not available as DNA resources for future functional genomics investigations. The clones identified through this analysis will be added to the nonredundant set being used to create human eye expressed transcript microarrays for functional studies. The library is also a good resource for full length cDNA for individual genes. Even for transcripts that have not yet emerged from the EST sequencing, we have been able to obtain complete coding sequences by PCR amplification of DNA template derived from the library (unpublished).

The new unamplified, un-normalized adult human retina cDNA libraries described in this manuscript are of excellent quality, as judged by clone length, transcript representation, and sequence read length. This is most strikingly demonstrated by the identification of a novel retina expressed gene, retbindin, that accounts for about 0.4% of the cDNAs sequenced in this study. Sequence comparisons show that retbindin belongs to a superfamily of riboflavin/folate binding proteins. These kinds of relationships are widely used in genomics to help identify function of novel proteins, but, as in many other cases, it is clear that this gives us only a general and preliminary view of the role of retbindin. It seems likely that retbindin is a secreted, post-translationally modified protein capable of binding cyclic or polyunsaturated small molecules. Since retbindin is expressed prominently in the retina, retinoids and carotenoids are possible candidates. Indeed there is considerable interest in identifying retinal proteins that bind carotenoids such as lutein [21]. Efforts are underway to express recombinant retbindin with the objective of screening these and other possible binding activities.

As we have pointed out for other NEIBank libraries [13,18,45], the use of unamplified cDNA libraries seems to improve the chances of detecting certain transcripts that, for whatever reason, grow poorly in bacterial host cells, since amplification may suppress the levels of certain clones. Retbindin might have been missed in other retina libraries for such reasons. Another factor favoring gene discovery in our strategy may be our preference for 5' directed sequencing. This tends to reveal protein coding regions that show distant evolutionary relationships and immediately indicate a novel protein product.

The same sequencing strategy also provides the greatest chance of identifying novel splice variants, several of which are also apparent in our dataset. Many gene transcripts have very long 3' UTRs so that 3' directed sequence may not reach identifiable ORF. Furthermore, 3' UTRs are not usually interrupted by introns. The retina sequence collection described here shows many alternative transcripts, some of which are likely to have important functional consequences. Two notable examples are the presence of an alternatively spliced exon in the gene for the transcription factor NRL, which could give rise to a variant with modified protein-protein interactions or DNA binding affinity, and the high frequency exon skipped variant of PDE6G, which would produce a truncated protein that could potentially alter control of the phototransduction cycle.

Coding sequence variants, such as those described above, can have obvious consequences, but variants in noncoding sequences can also be functionally significant, for example affecting RNA stability or translational efficiency [25,26]. Alternatively, some of these variants may simply be functionally neutral since it may make no difference whether a 3' UTR varies in length by a few hundred bases while others may be the result of errors in the splicing machinery of the cell. Splicing errors may not be inconsequential. Under normal conditions, mis-spliced transcripts may be dealt with without deleterious effects. However, it is conceivable that, particularly for highly expressed genes, splice errors could become sufficiently abundant to cause problems, depleting levels of functional RNA and protein and causing the accumulation of badly folded, nonfunctional proteins. For certain cell types, such as those of the neural retina that are typically not replaced over time, this could be another insult added to those that degrade physiological function with age. The data described here are derived from older donors and may therefore reveal some age related phenomena. These issues will be addressed in detail in future microarray experiments using these clone resources.

Acknowledgements

SLB is supported by the V. Kann Rasmussen Foundation (Denmark) and is a Career Development Awardee of Research to Prevent Blindness (RPB). We thank Dr. Weinu Gan for cDNA sequencing and Ray Tabios for technical assistance.

References

1. Rattner A, Sun H, Nathans J. Molecular genetics of human retinal disease. Annu Rev Genet 1999; 33:89-131.

2. Phelan JK, Bok D. A brief review of retinitis pigmentosa and the identified retinitis pigmentosa genes. Mol Vis 2000; 6:116-24 <http://www.molvis.org/molvis/v6/a16/>.

3. Eudy JD, Sumegi J. Molecular genetics of Usher syndrome. Cell Mol Life Sci 1999; 56:258-67.

4. Kimberling WJ, Orten D, Pieke-Dahl S. Genetic heterogeneity of Usher syndrome. Adv Otorhinolaryngol 2000; 56:11-8.

5. Bolz H, von Brederlow B, Ramirez A, Bryda EC, Kutsche K, Nothwang HG, Seeliger M, del C-Salcedo Cabrera M, Vila MC, Molina OP, Gal A, Kubisch C. Mutation of CDH23, encoding a new member of the cadherin gene family, causes Usher syndrome type 1D. Nat Genet 2001; 27:108-12.

6. Di Palma F, Holme RH, Bryda EC, Belyantseva IA, Pellegrino R, Kachar B, Steel KP, Noben-Trauth K. Mutations in Cdh23, encoding a new type of cadherin, cause stereocilia disorganization in waltzer, the mouse model for Usher syndrome type 1D. Nat Genet 2001; 27:103-7.

7. Bork JM, Peters LM, Riazuddin S, Bernstein SL, Ahmed ZM, Ness SL, Polomeno R, Ramesh A, Schloss M, Srisailpathy CR, Wayne S, Bellman S, Desmukh D, Ahmed Z, Khan SN, Kaloustian VM, Li XC, Lalwani A, Riazuddin S, Bitner-Glindzicz M, Nance WE, Liu XZ, Wistow G, Smith RJ, Griffith AJ, Wilcox ER, Friedman TB, Morell RJ. Usher syndrome 1D and nonsyndromic autosomal recessive deafness DFNB12 are caused by allelic mutations of the novel cadherin-like gene CDH23. Am J Hum Genet 2001; 68:26-37.

8. Allikmets R, Singh N, Sun H, Shroyer NF, Hutchinson A, Chidambaram A, Gerrard B, Baird L, Stauffer D, Peiffer A, Rattner A, Smallwood P, Li Y, Anderson KL, Lewis RA, Nathans J, Leppert M, Dean M, Lupski JR. A photoreceptor cell-specific ATP-binding transporter gene (ABCR) is mutated in recessive Stargardt macular dystrophy. Nat Genet 1997; 15:236-46.

9. Duggan DJ, Bittner M, Chen Y, Meltzer P, Trent JM. Expression profiling using cDNA microarrays. Nat Genet 1999; 21:10-4.

10. Bonaldo MF, Lennon G, Soares MB. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 1996; 6:791-806.

11. Wistow G. A project for ocular bioinformatics: NEIBank. Mol Vis 2002; 8:161-3 <http://www.molvis.org/molvis/v8/a22/>.

12. Simms D. mRNA isolation for high quality cDNA. Focus 1995; 17:39-42.

13. Wistow G, Bernstein SL, Wyatt MK, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of adult human lens for the NEIBank Project: Over 2000 non-redundant transcripts, novel genes and splice variants. Mol Vis 2002; 8:171-84 <http://www.molvis.org/molvis/v8/a24/>.

14. Wistow G, Bernstein SL, Touchman JW, Bouffard G, Wyatt MK, Peterson K, Gao J, Buchoff P, Smith D. Grouping and identification of sequence tags (GRIST): Bioinformatics tools for the NEIBank database. Mol Vis 2002; 8:164-70 <http://www.molvis.org/molvis/v8/a23/>.

15. Bernstein SL, Wong P. Regional expression of disease-related genes in human and monkey retina. Mol Vis 1998; 4:24 <http://www.molvis.org/molvis/v4/a24/>.

16. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8:186-94.

17. Weeks DE, Conley YP, Mah TS, Paul TO, Morse L, Ngo-Chang J, Dailey JP, Ferrell RE, Gorin MB. A full genome scan for age-related maculopathy. Hum Mol Genet 2000; 9:1329-49.

18. Wistow G, Bernstein SL, Ray S, Wyatt MK, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of adult human iris for the NEIBank Project: Steroid-response factors and similarities with retinal pigment epithelium. Mol Vis 2002; 8:185-95 <http://www.molvis.org/molvis/v8/a25/>.

19. Rando RR. Polyenes and vision. Chem Biol 1996; 3:255-62.

20. Beatty S, Koh H, Phil M, Henson D, Boulton M. The role of oxidative stress in the pathogenesis of age-related macular degeneration. Surv Ophthalmol 2000; 45:115-34.

21. Yemelyanov AY, Katz NB, Bernstein PS. Ligand-binding characterization of xanthophyll carotenoids to solubilized membrane proteins derived from human retina. Exp Eye Res 2001; 72:381-92.

22. Batey DW, Daneshgar KK, Eckhert CD. Flavin levels in the rat retina. Exp Eye Res 1992; 54:605-9.

23. Lucius R, Mentlein R, Sievers J. Riboflavin-mediated axonal degeneration of postnatal retinal ganglion cells in vitro is related to the formation of free radicals. Free Radic Biol Med 1998; 24:798-808.

24. Baltimore D. Our genome unveiled. Nature 2001; 409:814-6.

25. van der Velden AW, Thomas AA. The role of the 5' untranslated region of an mRNA in translation regulation during development. Int J Biochem Cell Biol 1999; 31:87-106.

26. de Moor CH, Richter JD. Translational control in vertebrate development. Int Rev Cytol 2001; 203:567-608.

27. Wistow G, Sardarian L, Gan W, Wyatt MK. The human gene for gammaS-crystallin: alternative transcripts and expressed sequences from the first intron. Mol Vis 2000; 6:79-84 <http://www.molvis.org/molvis/v6/a11/>.

28. Tsang SH, Gouras P, Yamashita CK, Kjeldbye H, Fisher J, Farber DB, Goff SP. Retinal degeneration in mice lacking the gamma subunit of the rod cGMP phosphodiesterase. Science 1996; 272:1026-9.

29. Slep KC, Kercher MA, He W, Cowan CW, Wensel TG, Sigler PB. Structural determinants for regulation of phosphodiesterase by a G protein at 2.0 Å. Nature 2001; 409:1071-7.

30. Granovsky AE, McEntaffer R, Artemyev NO. Probing functional interfaces of rod PDE gamma-subunit using scanning fluorescent labeling. Cell Biochem Biophys 1998; 28:115-33.

31. Molday RS. Photoreceptor membrane proteins, phototransduction, and retinal degenerative diseases. The Friedenwald Lecture. Invest Ophthalmol Vis Sci 1998; 39:2491-513.

32. Fong SL. Characterization of the human rod transducin alpha-subunit gene. Nucleic Acids Res 1992; 20:2865-70.

33. Swanson DA, Chang JT, Campochiaro PA, Zack DJ, Valle D. Mammalian orthologs of C. elegans unc-119 highly expressed in photoreceptors. Invest Ophthalmol Vis Sci 1998; 39:2085-94.

34. Motohashi H, Shavit JA, Igarashi K, Yamamoto M, Engel JD. The world according to Maf. Nucleic Acids Res 1997; 25:2953-9.

35. Rehemtulla A, Warwar R, Kumar R, Ji X, Zack DJ, Swaroop A. The basic motif-leucine zipper transcription factor Nrl can positively regulate rhodopsin gene expression. Proc Natl Acad Sci U S A 1996; 93:191-5.

36. Ogino H, Yasuda K. Induction of lens differentiation by activation of a bZIP transcription factor, L-Maf. Science 1998; 280:115-8.

37. Sharon-Friling R, Richardson J, Sperbeck S, Lee D, Rauchman M, Maas R, Swaroop A, Wistow G. Lens-specific gene recruitment of zeta-crystallin through Pax6, Nrl/Maf and brain suppressor sites. Mol Cell Biol 1998; 18:2067-76.

38. Kim JI, Li T, Ho IC, Grusby MJ, Glimcher LH. Requirement for the c-Maf transcription factor in crystallin gene regulation and lens development. Proc Natl Acad Sci U S A 1999; 96:3781-5.

39. Kawauchi S, Takahashi S, Nakajima O, Ogino H, Morita M, Nishizawa M, Yasuda K, Yamamoto M. Regulation of lens fiber cell differentiation by transcription factor c-Maf. J Biol Chem 1999; 274:19254-60.

40. Ring BZ, Cordes SP, Overbeek PA, Barsh GS. Regulation of mouse lens fiber cell development and differentiation by the Maf gene. Development 2000; 127:307-17.

41. Bessant DA, Payne AM, Mitton KP, Wang QL, Swain PK, Plant C, Bird AC, Zack DJ, Swaroop A, Bhattacharya SS. A mutation in NRL is associated with autosomal dominant retinitis pigmentosa. Nat Genet 1999; 21:355-6.

42. Farjo Q, Jackson A, Pieke-Dahl S, Scott K, Kimberling WJ, Sieving PA, Richards JE, Swaroop A. Human bZIP transcription factor gene NRL: structure, genomic sequence, and fine linkage mapping at 14q11.2 and negative mutation analysis in patients with retinal degeneration. Genomics 1997; 45:395-401.

43. Yap KL, Ames JB, Swindells MB, Ikura M. Diversity of conformational states and changes within the EF-hand protein superfamily. Proteins 1999; 37:499-507.

44. del Arco A, Satrustegui J. Molecular cloning of Aralar, a new member of the mitochondrial carrier superfamily that binds calcium and is present in human muscle and brain. J Biol Chem 1998; 273:23327-34.

45. Wistow G, Bernstein SL, Wyatt MK, Fariss RN, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of human RPE/choroid for the NEIBank Project: Over 6000 non-redundant transcripts, novel genes and splice variants. Mol Vis 2002; 8:205-20 <http://www.molvis.org/molvis/v8/a27/>.
