Molecular Vision 2004;
Received 11 December 2003 | Accepted 20 April 2004 | Published 20 April 2004
Bioinformatic approaches for identification and characterization of olfactomedin related genes with a potential role in pathogenesis of ocular disorders
Arijit Mukhopadhyay, Sangita Talukdar, Ashima Bhattacharjee, Kunal Ray
Human Genetics & Genomics Division, Indian Institute of Chemical Biology, Kolkata, India
Correspondence to: Dr. Kunal Ray, Human Genetics & Genomics Division,
Indian Institute of Chemical Biology, 4 Raja S. C. Mullick Road,
Jadavpur, Kolkata-700 032, India; Phone: 033-2473-0350 (or
-3491/0492/6793); FAX: 033-2473-5197 (or 033-2472-3967); email:
Ms. Talukdar is now at the Wageningen University & Research Centre, Botanical Centrum Gebouw 352, The Netherlands
Purpose: To identify olfactomedin domain containing proteins, which are expressed in the eye and have similarity to myocilin, to test as potential candidates for eye diseases. Most of the mutations in myocilin causing primary open angle glaucoma are located in the olfactomedin domain. In vitro experiments demonstrated interaction between optimedin and myocilin through the conserved olfactomedin domains of the proteins in rats, and it was speculated that optimedin might have a role in the pathogenesis of ocular disorders. Hence, we aimed to identify myocilin related human proteins having conserved olfactomedin domains with potential to interact between them and examine the expression patterns in the eye by bioinformatics approaches. This endeavor would have the potential to identify new candidate genes for eye diseases in general and glaucoma in particular to be tested by wet-lab experiments.
Methods: Proteins with homology to myocilin were selected by BLASTp at the NCBI server. cDNA sequences and corresponding genomic contigs were retrieved. Pairwise BLAST was done to investigate the gene structure. The human EST database and NEIBank were searched against the selected cDNAs to look for tissue specific expression of the transcripts.
Results: The study led to the identification of three groups of proteins encoded by three different genes: Noelin 1 (9q34.3), Noelin 2 (19p13.2), and Noelin 3 (1p22), encompassing 45,575 bp, 82,679 bp, and 193,421 bp of the genomic sequence, respectively. Genomic structures, alternate usage of exons, and molecular evolution of the Noelins were determined. Similar structures of the genes, splicing patterns and high levels of homology shed light on the relatedness and molecular evolution of this group of olfactomedin related proteins. Strikingly, however, Noelin 1 and Noelin 3 were found to be expressed as multiple splice variants while only a single spliced transcript could be identified for Noelin 2. A human EST database search suggested the expression of all three Noelin genes in the brain but only two (Noelin 1 and Noelin 2) in the eye despite experimental evidence for expression of Noelin 3 in ocular tissue. Myocilin was determined to have similar levels (60-61%) of homology with all three Noelin gene products (Noelin 1_v1, Noelin 2_v1, and Noelin 3_v1) at the conserved olfactomedin domains.
Conclusions: Mammalian Noelin 1 evolved from its precursor, followed by evolution of Noelin 3 and Noelin 2 by gene duplication events. Myocilin might have evolved from Noelin 2 by gene duplication followed by exon fusion. Noelin 1 and Noelin 2 could be tested as candidate genes for eye diseases based on their expressions in the eye and shared olfactomedin domains with Myocilin in the C-termini of the respective proteins.
Eye and vision research provides a unique opportunity for bioinformatics studies. The small amounts of the different tissue types within the eye make it very difficult to work with the mRNAs to decipher the complex pathways and the interplay of proteins. For example, trabecular meshwork (TM) tissue is the most important part of the eye for studies on the pathogenesis of primary open angle glaucoma (POAG), but it forms a very small part of the eye. Hence, it is often very difficult to obtain pure TM cells free from contamination by other cell types. In public databases like GeneCards, where tissue specific expression of genes is recorded, nothing is mentioned about the expression levels of genes in eye tissues. The lack of information about the tissue-specific expression of eye proteins dictates the need for an in silico approach to understand eye diseases at the molecular level. The National Eye Institute has made a considerable effort to catalogue the genes expressed in different tissues in the eye, coupled with informatics tools. The aim of the NEIBank project is to use cDNA libraries that represent as closely as possible the transcript profile of the specialized tissues of the eye. While cultured cells can be powerful experimental tools, it is clear that they differ significantly from the tissue from which they were derived. As just one of many known examples (both published and unpublished), MHC gene expression seems to be absent in intact lens but is present in cultured lens cells that are used as models for the lens.
A major effort has been made to use bioinformatics to assemble, organize, and present the NEIBank EST data, to identify and group the high quality sequences, and to remove the various classes of poor quality, non-mRNA contaminants and chimeric clones. This has evolved into a rules-based procedure named GRIST (GRouping and Identification of Sequence Tags) that uses sequence matches generated by BLAST programs and extracts information from GenBank, UniGene, and other databases. The effort resulted in identification of large non-redundant transcripts, novel genes and splice variants [1,3,5-8]. In a recent effort as part of the NEIBank project, the characterization of gene expression patterns in the human trabecular meshwork and identification of the candidate genes for glaucoma was attempted and 3,459 independent TM expressed clones were obtained. These clones were grouped in 1,888 clusters, potentially representing individual expressed genes. Transcripts for the myocilin gene formed the third most abundant cluster in the TM collection and several other genes implicated in glaucoma (PITX2, CYP1B1, and optineurin) were also represented. Many other novel genes were also identified in this effort to elucidate the molecular mechanisms involved in glaucoma and other TM diseases. In the context of the vast opportunities in vision research that have been created by the human genome sequence and the bioinformatic tools, attempts were made in this manuscript to identify myocilin related olfactomedin family genes.
Olfactomedins are secreted polymeric glycoproteins of unknown function with evolutionarily conserved C-terminal motifs. Olfactomedin was originally identified as the major component of the mucus layer that surrounds the chemosensory dendrites of olfactory neurons. Homologues were subsequently also found in other tissues, including the brain, and in species ranging from Caenorhabditis elegans to Homo sapiens [11,12]. The molecular evolution of olfactomedin has also been studied by examining its phylogenetic history to identify conserved structural motifs. A study based on the comparison of protein sequences revealed that the evolution of the N-terminal half of the molecule involved extensive insertions and deletions while the C-terminal region evolved mostly through point mutations, suggesting evolutionary constraints in the C-terminal region for a predictably important functional role.
The first link between an olfactomedin related protein and a human disease was established by the discovery in 1997 that defects in myocilin caused POAG. Myocilin was originally described as a myosin-like acidic protein (isoelectric point 5.2) expressed predominantly in the photoreceptor cells of the retina and localized particularly in the rootlet and basal body of the connecting cilium, hence the protein was named myocilin. Polansky et al. identified the same protein while studying the effects of steroids on trabecular meshwork cells in culture. In the eye, the trabecular meshwork cells help regulate eye pressure by controlling the drainage of fluid from the eye as new fluid is produced. The cultured cells, when treated with steroids, secreted the same protein which Nguyen et al. called TIGR (for trabecular meshwork inducible glucocorticoid response protein). Myocilin is exclusively found in mammals and it has an evolutionarily conserved olfactomedin-like domain in its C-terminal region, including it in the family of olfactomedin-like proteins.
Mutations in the Myocilin gene have been associated with 3-4% of all POAG cases and a major portion of juvenile open angle glaucoma (JOAG) in different populations [13,17]. About 50 different mutations have been detected in different population groups, summarized in a recent review. Unfortunately, the pathogenesis of the disease continues to elude us. Adult onset POAG is also inherited as a non-Mendelian trait, whereas juvenile onset POAG exhibits mostly autosomal dominant inheritance. It has been observed that among the 3 exons of Myocilin, the majority of mutations (90%) are clustered in exon 3 while a few (10%) are located in exon 1 and none in exon 2. We have recently proposed that mammalian myocilin carries the signatures of two different primordial proteins, myosin-like and olfactomedin-like, that may have been fused in the course of evolution after arthropods to produce MYOC.
Like myocilin, another olfactomedin-like protein, optimedin, is expressed in the combined tissues of the eye angle (i.e., trabecular meshwork, iris, and ciliary body), the retina, and the brain of rats. In humans, this protein has been reported to be expressed in the trabecular meshwork and neural retina based on RT-PCR analysis. In vitro experiments demonstrate interaction between optimedin and myocilin through the conserved olfactomedin domain in rats, and there has been speculation that optimedin might have a role in eye pathogenesis. This observation prompted us to investigate the olfactomedin-like proteins with homology to myocilin and expression in the eye as potential candidates for eye diseases. Since glaucoma is transmitted as a monogenic disorder as well as a complex disease, the candidate genes could be examined for the causation of the disease either singly or in combination with other candidates. It is worth mentioning that recently a "digenic" form of glaucoma and a "tri-allelic" form of Bardet-Biedl Syndrome have been reported.
Here we have identified proteins with high levels of homology with mammalian myocilin, examined evolutionary relationship between these proteins, determined structures of the genes, characterized the splice variants and their relative expression levels by the analysis of the EST database, and discussed possible roles of these olfactomedin related proteins in pathogenesis of ocular disorders.
Identification of human proteins that have sequence homology to myocilin
Human myocilin (NP_000252) was used as query and the human protein database was searched by BLASTp at the NCBI server to identify proteins that have sequence homology to the query. A total of 119 hits were found. We individually aligned each of these proteins with myocilin and selected those which had an olfactomedin domain in the C-terminal region similar to myocilin and homology with myocilin extending beyond the conserved olfactomedin domain. We called these proteins "myocilin related". We excluded all the truncated protein entries which did not have either the start or stop codons defined, and the much larger latrophilin group of proteins harboring an olfactomedin region at the N-terminal instead of the C-terminal region (lectomedin [NP_056051], latrophilin [NP_036434], and others). The total numbers of proteins and corresponding cDNAs that were selected for further analysis are summarized in Table 1.
Phylogenetic analyses of human myocilin, noelin 1_v1, noelin 2_v1, and noelin 3_v1
Phylogenetic analysis of the amino acid sequences of myocilin, noelin 1_v1, noelin 2_v1, and noelin 3_v1 was carried out at the DDBJ server. The sequences were aligned using the multiple alignment tool ClustalW. An unrooted phylogenetic tree was estimated using the BLOSUM matrix and the neighbor-joining tree-building algorithm. Bootstrap values based upon 1,000 iterations provided estimates of statistical support for the tree. We also compared the relative homology between three noelins mentioned above (noelin 1_v1, noelin 2_v1, and noelin 3_v1) and myocilin by pairwise BLAST.
Pairwise BLAST and sequence analysis for determination of gene structure
The nucleotide links corresponding to the different protein entries in the database were searched for, and the corresponding cDNA sequences as well as the genomic contig were retrieved for further analysis. Pairwise BLAST was done for the genomic sequences with each of the cDNAs using appropriate accession numbers as identifiers. By comparing the co-ordinates of the "query" and the "subject" in the output data where homology was observed, the location and size of the coding sequence (exons) were determined. The intervening sequences in the genome between two adjacent exons, with no homology with the "query" cDNA, represented the introns. Within each of the putative introns the splice junctions (i.e., GT and AG at the 5' and 3' ends of the DNA fragment, respectively) were identified to ensure that the genomic DNA fragment truly represented an intron.
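The coordinate comparison and splice-junction check described above can be illustrated with a minimal Python sketch. It is not the procedure the authors ran (the analysis was done manually from pairwise BLAST output); the function names and the toy sequence are assumptions made for illustration, and exon coordinates are taken to be 1-based and inclusive on the genomic strand.

```python
from typing import List, Tuple

def introns_from_exons(exons: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return 1-based inclusive intervals lying between consecutive exons."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:  # there is unaligned genomic sequence in between
            introns.append((prev_end + 1, next_start - 1))
    return introns

def has_canonical_splice_sites(genomic: str, intron: Tuple[int, int]) -> bool:
    """True if the putative intron starts with GT and ends with AG."""
    start, end = intron
    seq = genomic[start - 1:end].upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Toy genomic fragment (hypothetical): exon, intron, exon.
genomic = "ATGGCC" + "GTAAGTTTCAG" + "GATTGA"
exons = [(1, 6), (18, 23)]  # genomic coordinates where the cDNA aligned
introns = introns_from_exons(exons)
checks = [has_canonical_splice_sites(genomic, i) for i in introns]
```

A putative intron that fails the check would suggest that the exon boundaries inferred from the alignment need to be shifted by a few bases, which is a common ambiguity when the same nucleotides occur at both ends of an intron.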
Parallel to determining manually the structures of the Noelin genes and usage of alternate exons in the splice variants, we examined the quality of the same set of data available from the relevant web sites ENSEMBL and NCBI AceView. Collating the data from these sites would have been more efficient. However, we found errors in the data from both sites, some of which included (but were not limited to) the following: (a) In ENSEMBL, the splice variant noelin 1_v2 does not exist. (b) In AceView there are five different isoforms for noelin 2 among which, except for the "a" form corresponding to noelin 2_v1, all others are incomplete entries (do not have stop codons). Similar errors were also observed for noelin 3. Hence, for this study, analysis of gene structure and usage of different exons in the splice variants were done manually as mentioned above.
Search of the human EST database for the expression pattern of Noelin genes
A qualitative estimation of the expression of the Noelin genes in the eye and brain was undertaken by a search of the human EST database using cDNAs of different splice variants of the genes as "query" sequence. To make sure that estimation of the expression of a targeted splice variant was not influenced by overlapping sequence of other splice variants, only those ESTs with an e-value (expect value) of 0.0 in the BLASTn analysis were taken. It is worthwhile to mention here that we looked for information on expression patterns of genes of interest in the corresponding GeneCards. Contrary to our search result from the EST database, the GeneCards for the Noelin 1 and Myocilin genes did not have any information on their expression in the eye. Hence we searched the NCBI human EST database to retrieve the maximum information available regarding expression of genes of interest. Since we were also interested in examining expression of these genes in specific tissues of the eye, we also searched the databases at NEIBank by NEIBLAST.
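The e-value cut-off described above can be expressed as a simple filter. The sketch below assumes the hits are available in the standard 12-column tab-separated BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score); the study itself used the NCBI web server, so this layout, the function name, and the example identifiers are all assumptions for illustration.

```python
from typing import List, Tuple

def strict_est_hits(tabular_text: str, max_evalue: float = 0.0) -> List[Tuple[str, str, float]]:
    """Keep (query, subject, evalue) for hits whose e-value is <= max_evalue."""
    kept = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])  # column 11 of the 12-column tabular layout
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept

# Hypothetical hits: identifiers and scores are invented for illustration.
example = "\n".join([
    "noelin1_v1\tEST_A\t100.0\t650\t0\t0\t1\t650\t1\t650\t0.0\t1201",
    "noelin1_v1\tEST_B\t97.5\t200\t5\t0\t10\t209\t1\t200\t2e-95\t342",
])
hits = strict_est_hits(example)
```

With the default threshold only the first hit is retained. The second hit, although highly significant, is dropped because a short match of this kind could fall in sequence shared between splice variants.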
Identification of "myocilin related" proteins
Myocilin (NP_000252) was used as a query to search the NCBI database by BLASTp for proteins that were highly homologous to it. On the basis of homology to the N- and C-termini of myocilin, three groups of proteins were selected under the following names: (a) Olfactomedin 1/Noelin 1 and Pancortin isoforms; (b) Noelin 2; and (c) Noelin 3 isoforms (Table 1).
A preliminary search of the human EST database, using the available cDNAs corresponding to these "myocilin related" proteins as query, suggested the expression of Noelin 1 and Noelin 2 in eye tissues. No human EST from eye origin was identified for Noelin 3. The results were also consistent when we searched against transcripts of the eye in NEIBank taking different isoforms of Noelin 1, Noelin 2, and Noelin 3 as query. Detailed analysis of the EST database is provided later in this section.
Homology between noelin 1, noelin 2, noelin 3, and myocilin
Among the isoforms of each of the three noelins, those containing the conserved C-terminal olfactomedin domain and having at least 30% identity with myocilin were used for comparison (Table 2). Three larger isoforms of Noelin 1 (noelin 1_v1, noelin 1_v2, and noelin 1_v3) and two of Noelin 3 (noelin 3_v1 and noelin 3_v2), that differed at the N-termini by only the 5' end exons (Table 1), had a similar homology with myocilin (data not shown). Hence, as an example, homology between noelin 1_v1, noelin 2_v1, noelin 3_v1, and myocilin is shown in Table 2. It was observed that there was a higher level of identity (59-64%) between noelin 1_v1, noelin 2_v1, and noelin 3_v1 with no gap required for the alignment. Among these three proteins, noelin 1_v1 and noelin 3_v1 have a slightly higher level of identity (64%) compared to other combinations (59-61%). On the other hand, myocilin (NP_000252, 504 aa) has a much lower level of identity (30-32%) with the other 3 proteins and requires insertion of a gap (6-13%) to maximize homology (Table 2). As shown in Figure 1, multiple alignment of noelin 1_v1, noelin 2_v1, noelin 3_v1, and myocilin by ClustalW revealed that C-termini containing olfactomedin domains had much higher homology compared to the N-termini.
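To make the identity and gap percentages concrete, the sketch below computes both from a pair of already aligned sequences, using the alignment length as the denominator in the way BLAST reports them. The function name and the toy peptides are assumptions for illustration and are not taken from Table 2.

```python
from typing import Tuple

def identity_and_gaps(aligned_a: str, aligned_b: str) -> Tuple[float, float]:
    """Return (percent identity, percent gaps) over the alignment columns."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    columns = len(aligned_a)
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    # A column counts as a gap if either sequence has a gap character there.
    gaps = sum(1 for x, y in zip(aligned_a, aligned_b) if x == "-" or y == "-")
    return 100.0 * matches / columns, 100.0 * gaps / columns

# Toy aligned peptides (hypothetical), with one gap in the first sequence.
identity, gap_fraction = identity_and_gaps("MKV-LLAG", "MKVQLTAG")
```

Here six of eight columns are identical and one column contains a gap, giving 75% identity and 12.5% gaps.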
To examine the evolutionary relationship between these proteins, phylogenetic analysis was done using the primary sequences of these proteins (Figure 2). Thus, a neighbor-joining tree constructed using the entire sequence of the proteins suggests that noelin 1_v1 and noelin 3_v1 are closely related and these two proteins, along with noelin 2_v1 and myocilin, evolved from the same root. To further investigate the molecular evolution of these proteins, the structure of the genes needed to be characterized.
Structure of the Noelin 1 gene
Five different isoforms of Noelin 1 were identified among which only four had cDNA submissions in the database (Table 1). The genomic contig (NT_019501) for Noelin 1, located at 9q34.3, was retrieved from the NCBI database and used for pairwise BLAST against each cDNA mentioned above. The locations of the exons in the gene containing the coding sequence represented by each cDNA were identified and boundaries of intervening sequences (introns) flanked by GT and AG at the 5' and 3' ends, respectively, were delineated (Figure 3A). It was observed that, except for the first 50 amino acids, noelin 1_v3, without any cDNA entry in the database, had complete identity with noelin 1_v1. It was also observed that this stretch of 50 amino acids of noelin 1_v1 at the N-terminal had 89% identity with the first 53 residues of noelin 1_v3 allowing for gaps in the alignment. Based on this information and by inspecting the genomic contig sequence mentioned above, the region of the genome coding for the 50 residues of noelin 1_v1 was identified. Thus, the Noelin 1 gene (BK001427) was found to contain a total of ten exons covering 45 kb of genomic DNA, and up to six exons were used by any particular splice variant as shown in Figure 3A. It was interesting to note that the fourth and last exon used in noelin 1_v4 contained only a single codon and the remaining nucleotides represented untranslated regions.
Structure of the Noelin 2 gene
The gene structure of Noelin 2 was determined by pairwise BLAST of the only cDNA (NM_058164) available for the gene with the retrieved genomic contig (NT_011295) located at 19p13.2. The coding sequence of Noelin 2 was observed to be divided into 6 exons spanning 82 kb of the genomic sequence (BK001428, Figure 3C).
Structure of the Noelin 3 gene
For each cDNA of six different isoforms of Noelin 3 (Table 1), pairwise BLAST searches were done against the retrieved genomic sequence at 1p22 (NT_028050, corresponding to OLFM3). The locations of the exons in the gene (BK001429) containing the coding sequence represented by each cDNA and the introns were identified as mentioned above (Figure 3B). It was interesting to note that six splice variants could be divided into two subgroups based on the gene product, the longer forms (383-478 aa) and the shorter forms (140-235 aa), each consisting of 3 proteins. The members of both the subgroups contained a non-homologous stretch of amino acids at the N-termini gained by the usage of unique exons. The frame for translation did not alter due to the usage of different exons since in every case the first intervening sequence split the coding sequence, not within, but between adjacent codons at the splice junction. The splice variants (noelin 3_v4, noelin 3_v5, and noelin 3_v6), which were predicted to code for shorter proteins, contained sequences from an internal exon (number 7) that would code for 18 amino acids followed by a codon for the termination of translation resulting in the elimination of the entire olfactomedin domain from the gene product.
Molecular evolution of noelin 1, noelin 2, noelin 3, and myocilin
A comparison of the structures of the Noelin genes shows striking similarities: (a) Some of the exons at comparable locations within the genes have similar sizes (Figure 3). (b) Introns split the coding sequences between two adjacent codons at all splice junctions except one, at the 3' region of the genes where two adjacent exons share one triplet codon separated by the intervening intron between them. (c) Adjacent pairs of codons intervened by the last introns of 3 Noelin genes encode for the same pair of amino acids (i.e., arginine and valine). Conservation of the sizes of the exons and the splice frames provided compelling evidence that the 3 Noelin genes evolved by gene duplication.
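Whether an intron falls between two codons or within one can be read from the cumulative length of the coding sequence upstream of it. The sketch below computes this "intron phase" (0 means the intron lies between adjacent codons; 1 or 2 means the flanking exons share a codon). The function name and the exon lengths are hypothetical and are not the values from Figure 3.

```python
from typing import List

def intron_phases(coding_exon_lengths: List[int]) -> List[int]:
    """Phase of each intron, given the coding length (in bp) of each exon in order."""
    phases = []
    coding_bases_so_far = 0
    for length in coding_exon_lengths[:-1]:  # no intron follows the last exon
        coding_bases_so_far += length
        phases.append(coding_bases_so_far % 3)
    return phases

# Hypothetical coding lengths for a four-exon transcript.
phases = intron_phases([150, 63, 121, 200])
```

In this toy case the first two introns are in phase 0 and the last is in phase 1, which mirrors the pattern described above: all junctions fall between codons except one near the 3' end, where a codon is shared by two adjacent exons. Genes that duplicated from a common ancestor are expected to retain the same phase at corresponding introns.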
When Xenopus noelin was compared with the 3 human noelins (i.e., noelin 1_v1, noelin 2_v1, and noelin 3_v1), human noelin 1_v1 demonstrated the highest homology (92% identity as compared to 59-64% for the other proteins). This observation suggested that, among mammalian Noelin genes, Noelin 1 originally evolved from its ancestral precursor gene. Hence, within the mammalian lineage, Noelin 2 and Noelin 3 must have evolved from Noelin 1 by gene duplication. It was further observed that the N-terminal region of noelin 1_v1 (1-209 amino acids) had 69% and 77% homology with the corresponding regions of noelin 2_v1 (1-195 amino acids) and noelin 3_v1 (1-199 amino acids), with 3 and zero gaps required for alignment, respectively. On the other hand, the more conserved olfactomedin domain located in the C-terminal region of noelin 1_v1 (210-467 amino acids) had a similar level of homology (81-82%) with both noelin 2_v1 (196-454 amino acids) and noelin 3_v1 (200-458 amino acids) with no gaps required for the best alignment. It is noteworthy that the terminal codons of all internal exons are highly conserved between Noelin 2 and Noelin 3 (Figure 3).
The comparison of myocilin (amino acid numbers 246-502) with the noelins at the C-terminal region (harboring the olfactomedin domain) showed a similar level (60-61%) of homology but the N-terminal region did not show any significant homology as judged by BLASTp results. However, a manual comparison of the same region of myocilin (amino acid numbers 1-201) revealed that it had a higher level of identity with noelin 2_v1 (46 aa) relative to noelin 1_v1 (33 aa) and noelin 3_v1 (29 aa). It is interesting to note that the genomic region of Noelin 2 and Myocilin encoding for a myosin-like coiled-coil structure at the N-terminal region of the proteins are divided into 4 exons (numbers 1-4) in the former but contained in a single exon in the latter gene. Similarly, the genomic region of Noelin 2 encoding an olfactomedin-like domain is split into two exons (numbers 5 and 6) while the corresponding genomic region of Myocilin encoding for the same protein domain is represented by a single exon. Also, the number of amino acids encoded by both the genes for this region is 261 (Figure 4). The last nucleotide (G) of both exon 4 of Noelin 2 and exon 1 of Myocilin represents the first base of a codon. Similarly, the first two nucleotides of exon 5 of Noelin 2 (GC) and exon 3 of Myocilin (GA) represent the second and third bases of a codon. If exons 1 and 3 of Myocilin were spliced together, the same amino acid (glycine) would have been coded at the junction as for the splicing of exon 4 and 5 of Noelin 2. Based on these observations, we propose that Myocilin has most likely evolved from Noelin 2 by gene duplication and exon fusion (Figure 5).
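The junction argument above can be checked with a few lines of Python: joining the last base of the upstream exon to the first two bases of the downstream exon gives the codon that spans the splice site. The bases are those quoted in the text; the function name is an assumption for illustration, and only the glycine codons of the standard genetic code are listed.

```python
# Glycine codons in the standard genetic code (GGN).
GLYCINE_CODONS = {"GGA", "GGC", "GGG", "GGT"}

def junction_codon(upstream_exon_tail: str, downstream_exon_head: str) -> str:
    """Join the partial codons on either side of an intron into one codon."""
    codon = (upstream_exon_tail + downstream_exon_head).upper()
    if len(codon) != 3:
        raise ValueError("the two partial codons must add up to three bases")
    return codon

# Noelin 2: last base of exon 4 (G) joined to the first two bases of exon 5 (GC).
noelin2_codon = junction_codon("G", "GC")
# Myocilin: last base of exon 1 (G) joined to the first two bases of exon 3 (GA),
# i.e. the hypothetical exon 1 to exon 3 splice discussed in the text.
myocilin_codon = junction_codon("G", "GA")

both_glycine = noelin2_codon in GLYCINE_CODONS and myocilin_codon in GLYCINE_CODONS
```

The two codons differ only at the third position (GGC versus GGA), so both encode glycine, which is consistent with the conserved junction described above.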
Expression profile of Noelins based on EST database
After searching the literature for experimental data on expression of Noelin genes in humans, little information was found. Originally, expression of Noelin 3 in the human retina, trabecular meshwork cells, and brain was claimed based on RT-PCR. Recently, the same group reported Noelin 1 and Noelin 2 expression in the human retina and brain using the same technique. Among other species, the expression of Noelin 1 in the neural crest, cranial ganglia, and brain of Xenopus, chicken, and mouse has been reported. In the rat, the gene was reported to be expressed in the brain and eye. Noelin 3 expression in retina and brain of rats has been shown by northern analysis. Due to a paucity of expression data on human Noelin genes, we searched the human EST database as described above (Methods). From the collated data set from human EST hits, it appears that all known Noelin 1 splice variants and Noelin 2 are expressed primarily in the brain and eye, while Noelin 3 expression was evident in the brain and other tissues but not in the eye (Table 3). The data provide only a qualitative assessment of the expression of Noelin genes, despite the variable number of clones in Table 3, because the ESTs in the database represent a collection of data from non-uniform heterogeneous sources. We searched the NEIBank repository of EST data to identify tissue specific expression of the Noelin genes within the eye, and retrieved one transcript from both retina and choroid corresponding to Noelin 1 (i.e., Noelin 1_v1, Noelin 1_v2, and Noelin 1_v3). The present status of the database limits the scope of investigation to any further depth. Nevertheless, such in silico approaches have the potential to provide a priori information regarding the status of expression which must be tested and confirmed by wet-lab experiments.
Possible functional implications of noelins in the eye and its pathogenesis
It is interesting to note that among the olfactomedin-like proteins, in addition to myocilin, at least two Noelins (Noelin 1 and Noelin 2) are expressed in the eye though the reported expression of Noelin 3 in the eye needs to be re-examined. It has been demonstrated by in vitro experiments that myocilin forms a homodimer through its leucine zipper domain and a heterodimer with noelin 3 (optimedin) by interaction through the olfactomedin domains. The olfactomedin regions between the 3 Noelin gene products have high levels of homology (79-82%). Hence, it is reasonable to predict that myocilin is likely to interact in vivo with the noelin 1 and noelin 2 isoforms through the olfactomedin domains provided that the interacting genes are expressed in the same location. Such interaction may have functional implications and/or pathology related to the aberration of such interactions. However, experimental evidence for co-expression of Myocilin and Noelins in the same ocular tissue in human is needed.
Myocilin, a mammalian olfactomedin-like protein, is expressed in the trabecular meshwork of the eye and when defective causes primary open angle glaucoma (POAG). The biological function of myocilin is not well understood. The natural null mutant in humans does not show POAG and the myocilin knock-out mouse does not have eye related problems. The dominant transmission of POAG in families with mutant myocilin suggests a gain of function of the defective protein. Human myocilin bears the highest level of homology with the Xenopus noelin, which is also an olfactomedin-like protein, among sub-mammalian species. It has been reported (based on in situ hybridization) that Xenopus Noelin is expressed only in post-mitotic neural tissues, the cranial ganglia, eye, and neural tube. A recent report predicted that optimedin (noelin 3), another olfactomedin domain containing protein with homology to myocilin, could be involved in pathogenesis of ocular disorders based on its expression in the mammalian (rat and human) eye and its interaction with myocilin. Thus, we reasoned that other "myocilin related" proteins might exist, which when defective would lead to pathology within the eye either singly or through interaction with other related proteins. The database search identified three such groups of proteins (including noelin 3) with homology to myocilin, but they were listed under various names, which made it confusing to relate them to one another and to study their evolution, expression, and functional implications. Thus we have proposed a systematic use of existing nomenclatures based on characterization of the proteins as splice variants of three different genes, following the guidelines of HUGO.
Determination of the structure of the genes and a detailed analysis of the splice junction sequences helped to sort the most likely route of molecular evolution of these proteins. It is intriguing to note that two Noelin genes (Noelin 1 and Noelin 3) have multiple splice variants, but no splice variant is yet known for Noelin 2. It is noteworthy that our effort to find such gene products from the EST database did not result in any hits. However, AceView at the NCBI server includes a Noelin 2 isoform, which is an incomplete submission missing the complete C-terminus of the protein, hence it is not listed in Table 1. Comparison of this noelin 2 isoform with noelin 1 and noelin 3 proteins suggests that the full-length isoform of noelin 2 is homologous to noelin 1_v2 and noelin 3_v2 (data not shown). Since the only Noelin 2 protein identified so far has both structural and sequence homology with Noelin 1_v1 and Noelin 3_v1 among all other variants, we intend to identify Noelin 2 as Noelin 2_v1 for proper nomenclature of yet unidentified other splice variants (if any) of the gene. It has been proposed that extensive modification of the N-terminal half and the acquisition of a C-terminal SDEL endoplasmic-reticulum-targeting sequence may have enabled olfactomedin to adopt new functions in the mammalian central nervous system. The identification of a large number of splice variants for both Noelin 1 and Noelin 3 and the expression of these variants at the RNA level should lead to experiments to identify these proteins in the brain and decipher their functional role in the neural network.
While gene duplication is a well-known means for the creation of a multigene family, and fusion of exons is known to be responsible for creation of splice variants and pseudogenes, the proposed fusion of exons for the evolution of Myocilin from its precursor is not yet a described mechanism of gene evolution in the literature, to the best of our knowledge. However, the fusion of two proteins encoded by two different genes in lower organisms to a single product encoded by a single gene in higher organisms has been reported. One example among many is the fusion of two E. coli proteins (γ-glutamyl phosphate reductase and glutamate 5-kinase) to a single human protein (δ-1-pyrroline-5-carboxylate synthetase).
Experimental evidence for the expression of Noelin genes in neuronal tissues in the amphibia, chicken, rat, and mouse is consistent with the expression pattern of Noelin in human as revealed by EST database analysis. Interestingly, the reported expression of Noelin 3 in the human eye is not supported by the EST data available at both NCBI and NEI databases. Further, we attempted to examine tissue specific expression of Noelin in the eye by searching the NEIBank EST database. However, this strategy did not reveal much new information. We predict that the tissue specific dataset at the NEIBank is not yet large enough to include genes not expressed at high level. To test this possibility, we looked in the NEIBank for ESTs of the genes that are expressed in retinal photoreceptor cells and which cause retinal diseases when defective. However, we did not find significant hits for many of these genes in the retinal database (NEI human retina unnormalized). The elegant experiments done to show in vitro that noelin 3_v1 forms a heterodimer with myocilin through the interaction of the olfactomedin domains of these two proteins are interesting. The olfactomedin domain of myocilin has a similar level of homology (60%) with the protein domain in all 3 noelins. Hence, it is reasonable to hypothesize that myocilin will also make heterodimers with noelin 1_v1 and noelin 2_v1 independently, which needs to be tested first by in vitro experiments. The expression of Noelin 1 and Noelin 2 in the eye and the possibility of interaction of both gene products with myocilin raise the real possibility that these two genes might be candidates for eye diseases. In mammals, noelin has been found in the neural plate and neural crest, as well as in the cranial ganglia. It is also known that the developmental anomalies that are frequently associated with glaucoma are mainly caused by the abnormal differentiation of neural crest cells.
Further support for the hypothesis comes from the reported similar chromosomal location (9q34) of Noelin 1 and LMX1B genes. LMX1B has been implicated in causing nail-patella syndrome, a developmental anomaly that co-segregates with open angle glaucoma in some families.
Unlike retinitis pigmentosa (RP), most other eye diseases are largely sporadic, plausibly reflecting the effects of multiple genetic loci and environmental factors. Although POAG is transmitted as a monogenic trait in a limited number of families, for which six loci have been identified [35-40], it is largely sporadic and behaves as a complex disease, with no known biochemical pathway from which to choose candidate genes. The identification of the new genes and their inclusion in the repertoire of candidates for eye diseases provide the opportunity to test these genes for the causation of monogenic disorders. In addition, the identification of single nucleotide polymorphism (SNP) patterns in these genes in patients with complex forms of the disease, followed by association studies, might shed new light on common but more challenging eye disorders. The computational analysis presented in this article, done by data mining, provides an opportunity to test new hypotheses by wet-lab experiments that are directly related to the deciphering of human genetic diseases.
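As an illustration of the kind of association study mentioned above, the following minimal sketch computes an allelic chi-square test and odds ratio for a single SNP in a case-control design. It is not a method used in this study: the function names are hypothetical, the allele counts in the usage example are invented for illustration only, and a real analysis would also need to address genotype-level tests, multiple testing, and population stratification.

```python
import math


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one allele")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the null hypothesis of no association
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def chi_square_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def odds_ratio(case_minor, case_major, control_minor, control_major):
    """Odds ratio of carrying the minor allele in cases relative to controls."""
    return (case_minor * control_major) / (case_major * control_minor)


if __name__ == "__main__":
    # Hypothetical allele counts (illustrative only, not data from this study)
    counts = dict(case_minor=30, case_major=70, control_minor=20, control_major=80)
    chi2 = allelic_chi_square(**counts)
    print("chi-square:", round(chi2, 3))
    print("p-value:", round(chi_square_p_value_1df(chi2), 4))
    print("odds ratio:", round(odds_ratio(**counts), 3))
```

The p-value helper relies on the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, which avoids any dependency outside the Python standard library.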
In summary, the major observations in this manuscript not previously described are: (a) A complete characterization of the genes for three olfactomedin related proteins which have been specifically selected on the basis of high homology with myocilin. (b) Identification of all possible splice variants of these 3 genes. (c) Evolutionary relatedness between these 3 genes at the molecular level as revealed by the analysis of the splice junctions and splice variants of these 3 genes. (d) Qualitative assessment regarding expression of different splice variants of the genes available in the EST database.
Financial support for the study from the Council of Scientific and Industrial Research, India, as well as pre-doctoral fellowships to AM and AB, is gratefully acknowledged. The study was partially supported by a grant to KR from the Indian Council of Medical Research, Government of India.
1. Wistow G. A project for ocular bioinformatics: NEIBank. Mol Vis 2002; 8:161-3 <http://www.molvis.org/molvis/v8/a22/>.
2. Shaughnessy M, Wistow G. Absence of MHC gene expression in lens and cloning of dbpB/YB-1, a DNA-binding protein expressed in mouse lens. Curr Eye Res 1992; 11:175-81.
3. Wistow G, Bernstein SL, Touchman JW, Bouffard G, Wyatt MK, Peterson K, Behal A, Gao J, Buchoff P, Smith D. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database. Mol Vis 2002; 8:164-70 <http://www.molvis.org/molvis/v8/a23/>.
4. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990; 215:403-10.
5. Wistow G, Bernstein SL, Wyatt MK, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of adult human lens for the NEIBank Project: over 2000 non-redundant transcripts, novel genes and splice variants. Mol Vis 2002; 8:171-84 <http://www.molvis.org/molvis/v8/a24/>.
6. Wistow G, Bernstein SL, Ray S, Wyatt MK, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of adult human iris for the NEIBank Project: steroid-response factors and similarities with retinal pigment epithelium. Mol Vis 2002; 8:185-95 <http://www.molvis.org/molvis/v8/a25/>.
7. Wistow G, Bernstein SL, Wyatt MK, Ray S, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of human retina for the NEIBank Project: retbindin, an abundant, novel retinal cDNA and alternative splicing of other retina-preferred gene transcripts. Mol Vis 2002; 8:196-204 <http://www.molvis.org/molvis/v8/a26/>.
8. Wistow G, Bernstein SL, Wyatt MK, Fariss RN, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K. Expressed sequence tag analysis of human RPE/choroid for the NEIBank Project: over 6000 non-redundant transcripts, novel genes and splice variants. Mol Vis 2002; 8:205-20 <http://www.molvis.org/molvis/v8/a27/>.
9. Tomarev SI, Wistow G, Raymond V, Dubois S, Malyukova I. Gene expression profile of the human trabecular meshwork: NEIBank sequence tag analysis. Invest Ophthalmol Vis Sci 2003; 44:2588-96.
10. Yokoe H, Anholt RR. Molecular cloning of olfactomedin, an extracellular matrix protein specific to olfactory neuroepithelium. Proc Natl Acad Sci U S A 1993; 90:4655-9.
11. Danielson PE, Forss-Petter S, Battenberg EL, deLecea L, Bloom FE, Sutcliffe JG. Four structurally distinct neuron-specific olfactomedin-related glycoproteins produced by differential promoter utilization and alternative mRNA splicing from a single gene. J Neurosci Res 1994; 38:468-78.
12. Karavanich CA, Anholt RR. Molecular evolution of olfactomedin. Mol Biol Evol 1998; 15:718-26.
13. Stone EM, Fingert JH, Alward WL, Nguyen TD, Polansky JR, Sunden SL, Nishimura D, Clark AF, Nystuen A, Nichols BE, Mackey DA, Ritch R, Kalenak JW, Craven ER, Sheffield VC. Identification of a gene that causes primary open angle glaucoma. Science 1997; 275:668-70.
14. Kubota R, Noda S, Wang Y, Minoshima S, Asakawa S, Kudoh J, Mashima Y, Oguchi Y, Shimizu N. A novel myosin-like protein (myocilin) expressed in the connecting cilium of the photoreceptor: molecular cloning, tissue expression, and chromosomal mapping. Genomics 1997; 41:360-9.
15. Polansky JR, Fauss DJ, Chen P, Chen H, Lutjen-Drecoll E, Johnson D, Kurtz RM, Ma ZD, Bloom E, Nguyen TD. Cellular pharmacology and molecular biology of the trabecular meshwork inducible glucocorticoid response gene product. Ophthalmologica 1997; 211:126-39.
16. Nguyen TD, Chen P, Huang WD, Chen H, Johnson D, Polansky JR. Gene structure and properties of TIGR, an olfactomedin-related glycoprotein cloned from glucocorticoid-induced trabecular meshwork cells. J Biol Chem 1998; 273:6341-50.
17. Alward WL, Fingert JH, Coote MA, Johnson AT, Lerner SF, Junqua D, Durcan FJ, McCartney PJ, Mackey DA, Sheffield VC, Stone EM. Clinical features associated with mutations in the chromosome 1 open-angle glaucoma gene (GLC1A). N Engl J Med 1998; 338:1022-7.
18. Ray K, Mukhopadhyay A, Acharya M. Recent advances in molecular genetics of glaucoma. Mol Cell Biochem 2003; 253:223-31.
19. Wiggs JL, Allingham RR, Vollrath D, Jones KH, De La Paz M, Kern J, Patterson K, Babb VL, Del Bono EA, Broomer BW, Pericak-Vance MA, Haines JL. Prevalence of mutations in TIGR/Myocilin in patients with adult and juvenile primary open-angle glaucoma. Am J Hum Genet 1998; 63:1549-52.
20. Mukhopadhyay A, Gupta A, Mukherjee S, Chaudhuri K, Ray K. Did myocilin evolve from two different primordial proteins? Mol Vis 2002; 8:271-9 <http://www.molvis.org/molvis/v8/a34/>.
21. Torrado M, Trivedi R, Zinovieva R, Karavanova I, Tomarev SI. Optimedin: a novel olfactomedin-related protein that interacts with myocilin. Hum Mol Genet 2002; 11:1291-301.
22. Vincent AL, Billingsley G, Buys Y, Levin AV, Priston M, Trope G, Williams-Lyn D, Heon E. Digenic inheritance of early-onset glaucoma: CYP1B1, a potential modifier gene. Am J Hum Genet 2002; 70:448-60.
23. Katsanis N, Ansley SJ, Badano JL, Eichers ER, Lewis RA, Hoskins BE, Scambler PJ, Davidson WS, Beales PL, Lupski JR. Triallelic inheritance in Bardet-Biedl syndrome, a Mendelian recessive disorder. Science 2001; 293:2256-9.
24. Krasnoperov V, Bittner MA, Holz RW, Chepurny O, Petrenko AG. Structural requirements for alpha-latrotoxin binding and alpha-latrotoxin-stimulated secretion. A study with calcium-independent receptor of alpha-latrotoxin (CIRL) deletion mutants. J Biol Chem 1999; 274:3590-6.
25. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994; 22:4673-80.
26. Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 1992; 89:10915-9.
27. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 1987; 4:406-25.
28. Tatusova TA, Madden TL. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett 1999; 174:247-50. Erratum in: FEMS Microbiol Lett 1999; 177:187-8.
29. Moreno TA, Bronner-Fraser M. Neural expression of mouse Noelin-1/2 and comparison with other vertebrates. Mech Dev 2002; 119:121-5.
30. Fautsch MP, Johnson DH. Characterization of myocilin-myocilin interactions. Invest Ophthalmol Vis Sci 2001; 42:2324-31.
31. Lam DS, Leung YF, Chua JK, Baum L, Fan DS, Choy KW, Pang CP. Truncations in the TIGR gene in individuals with and without primary open-angle glaucoma. Invest Ophthalmol Vis Sci 2000; 41:1386-91.
32. Kim BS, Savinova OV, Reedy MV, Martin J, Lun Y, Gan L, Smith RS, Tomarev SI, John SW, Johnson RL. Targeted disruption of the myocilin gene (Myoc) suggests that human glaucoma-causing mutations are gain of function. Mol Cell Biol 2001; 21:7707-13.
33. Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. Detecting protein function and protein-protein interactions from genome sequences. Science 1999; 285:751-3.
34. Sarfarazi M. Recent advances in molecular genetics of glaucomas. Hum Mol Genet 1997; 6:1667-77.
35. Sheffield VC, Stone EM, Alward WL, Drack AV, Johnson AT, Streb LM, Nichols BE. Genetic linkage of familial open angle glaucoma to chromosome 1q21-q31. Nat Genet 1993; 4:47-50.
36. Stoilova D, Child A, Trifan OC, Crick RP, Coakes RL, Sarfarazi M. Localization of a locus (GLC1B) for adult-onset primary open angle glaucoma to the 2cen-q13 region. Genomics 1996; 36:142-50.
37. Wirtz MK, Samples JR, Kramer PL, Rust K, Topinka JR, Yount J, Koler RD, Acott TS. Mapping a gene for adult-onset primary open-angle glaucoma to chromosome 3q. Am J Hum Genet 1997; 60:296-304.
38. Trifan OC, Traboulsi EI, Stoilova D, Alozie I, Nguyen R, Raja S, Sarfarazi M. A third locus (GLC1D) for adult-onset primary open-angle glaucoma maps to the 8q23 region. Am J Ophthalmol 1998; 126:17-28.
39. Wirtz MK, Samples JR, Rust K, Lie J, Nordling L, Schilling K, Acott TS, Kramer PL. GLC1F, a new primary open-angle glaucoma locus, maps to 7q35-q36. Arch Ophthalmol 1999; 117:237-41.
40. Sarfarazi M, Child A, Stoilova D, Brice G, Desai T, Trifan OC, Poinoosawmy D, Crick RP. Localization of the fourth locus (GLC1E) for adult-onset primary open-angle glaucoma to the 10p15-p14 region. Am J Hum Genet 1998; 62:641-52.
41. Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 1985; 39:783-91.