Molecular Vision 2014; 20:1281-1295 <http://www.molvis.org/molvis/v20/1281>
Received 15 March 2012 | Accepted 17 September 2014 | Published 19 September 2014

Electronic medical records and genomics (eMERGE) network exploration in cataract: Several new potential susceptibility loci

Marylyn D. Ritchie,1 Shefali S. Verma,1 Molly A. Hall,1 Robert J. Goodloe,2 Richard L. Berg,3 Dave S. Carrell,4 Christopher S. Carlson,5 Lin Chen,6 David R. Crosslin,7,8 Joshua C. Denny,9,10 Gail Jarvik,7,11 Rongling Li,12 James G. Linneman,13 Jyoti Pathak,14 Peggy Peissig,13 Luke V. Rasmussen,15 Andrea H. Ramirez,10 Xiaoming Wang,9 Russell A. Wilke,9,16 Wendy A. Wolf,17 Eric S. Torstenson,2 Stephen D. Turner,18 Catherine A. McCarty19

1Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA; 2Center for Human Genetics Research, Vanderbilt University, Nashville, TN; 3Biomedical Informatics Research Center, Biostatistics, Marshfield Clinic Research Foundation, Marshfield, WI; 4Group Health Research Institute, Seattle, WA; 5Fred Hutchinson Cancer Research Center, Seattle, WA; 6Ophthalmology, Marshfield Clinic Research Foundation, Marshfield, WI; 7Division of Medical Genetics, University of Washington, Seattle, WA; 8Department of Biostatistics, University of Washington, Seattle, WA; 9Departments of Biomedical Informatics, Vanderbilt University, Nashville, TN; 10Department of Medicine, Vanderbilt University, Nashville, TN; 11Departments of Medicine and Genome Sciences, University of Washington, Seattle, WA; 12Office of Population Genomics, National Human Genome Research Institute, Bethesda, MD; 13Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI; 14Department of Biomedical Informatics, Mayo Clinic College of Medicine, Rochester, MN; 15Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University, Chicago, IL; 16IMAGENETICS at Sanford Medical Center, Fargo, ND and Department of Internal Medicine, University of North Dakota, Fargo, ND; 17Division of Genetics and Genomics, Boston Children's Hospital and Department of Pediatrics, Harvard Medical School, Boston, MA; 18Public Health Sciences, University of Virginia, Charlottesville, VA; 19Essentia Institute of Rural Health, Duluth, MN

Correspondence to: Marylyn Ritchie, Pennsylvania State University, Center for Systems Genomics, The Huck Institutes for the Life Sciences, Department of Biochemistry and Molecular Biology, 512 Wartik Laboratory, University Park, PA 16802; Phone: (814) 863-5107; FAX: (814) 863-6699; email: marylyn.ritchie@psu.edu

Abstract

Purpose: Cataract is the leading cause of blindness in the world, and in the United States accounts for approximately 60% of Medicare costs related to vision. The purpose of this study was to identify genetic markers for age-related cataract through a genome-wide association study (GWAS).

Methods: In the electronic medical records and genomics (eMERGE) network, we ran an electronic phenotyping algorithm on individuals in each of five sites with electronic medical records linked to DNA biobanks. We performed a GWAS using 530,101 SNPs from the Illumina 660W-Quad in a total of 7,397 individuals (5,503 cases and 1,894 controls). We also performed an age-at-diagnosis case-only analysis.

Results: We identified several statistically significant associations with age-related cataract (45 SNPs) as well as age at diagnosis (44 SNPs). The 45 SNPs associated with cataract at p<1×10−5 are in several interesting genes, including ALDOB, MAP3K1, and MEF2C. All have potential biologic relationships with cataracts.

Conclusions: This is the first genome-wide association study of age-related cataract, and several regions of interest have been identified. The eMERGE network has pioneered the exploration of genomic associations in biobanks linked to electronic health records, and this study is another example of the utility of such resources. Explorations of age-related cataract including validation and replication of the association results identified herein are needed in future studies.

Introduction

Cataract is the leading cause of blindness in the world [1,2], is the leading cause of vision loss in the United States [3], and accounts for approximately 60% of Medicare costs related to vision [4]. Summary prevalence estimates indicate that 17.2% of Americans aged 40 years and older have cataract in either eye and 5.1% have pseudophakia or aphakia (previous cataract surgery). In addition to the implications for healthcare delivery and healthcare costs, cataract has been shown to be associated with falls and increased mortality [5-12], possibly because of associated systemic conditions. Women have a slightly higher risk of having cataract than men [13]. With increased life expectancy, the number of cataract cases and cataract surgeries is expected to increase dramatically unless primary prevention strategies can be developed and successfully implemented.

Several genetic loci have also been linked to cataract as an independent phenotypic trait. An extensive body of literature has addressed the role of genetics in childhood cataract [14], and it has been hypothesized that these same genes may be plausible candidates for age-related cataract [15]. It has been suggested that as many as 40 genes may be involved in age-related cataract [16]. Evidence for a major gene has been identified for cortical [17] and nuclear [18,19] cataract, with heritability estimates of 58% [20] and 48% [21], respectively. A whole genome STR scan conducted in families in Wisconsin revealed a major locus for age-related cortical cataract on chromosome 6p12-q12 [22], and specific candidate genes that have been studied include galactokinase (Gene_ID: 2584; OMIM: 604313) [23,24], apolipoprotein E (Gene_ID: 348; OMIM: 107741) [25], glutathione S-transferase (Gene_ID: 2944; OMIM: 138350) [26], N-acetyltransferase 2 (Gene_ID: 10; OMIM: 612182) [27,28], and estrogen metabolism genes [29]. Two recent studies found an association between the EPHA2 gene (Gene_ID: 1969; OMIM: 176946) and cataract [30,31].

Higher body mass index (BMI) has been shown in many studies to increase risk of cortical and posterior subcapsular (PSC) cataract (odds ratio [OR] = 1.5–2.5) [32-38]. A recent study found that nuclear cataract was not associated with obesity but was associated with the FTO obesity gene (Gene_ID: 79068; OMIM: 610966) in an Asian population [39]. Although familial aggregation studies have shown a potential role for gene and environment interactions in nuclear cataract [40,41], research in this area is limited. The association of glutathione S-transferase with cataract has been shown to be modified by smoking [42] and sunlight exposure [43]. No whole genome association SNP studies of age-related cataract in unrelated individuals have been reported in the medical literature. The purpose of this study was to conduct a genome-wide association study (GWAS) for age-related cataract and to prioritize top hits for further follow-up.

Methods

Phenotypic data

The National Human Genome Research Institute (NHGRI)-funded electronic medical records and genomics (eMERGE) network implemented an electronic phenotype algorithm to select cataract cases and controls [44]. Marshfield Clinic selected cataract as its primary eMERGE phenotype, and the algorithm, which uses diagnostic and procedure codes, was developed by the Marshfield Clinic Personalized Medicine Research Project (PMRP) investigators [45]. The five sites in eMERGE-I are Marshfield Clinic, Group Health Research Institute, Vanderbilt University, Mayo Clinic, and Northwestern University. This study included four of the sites: Marshfield Clinic, Group Health Research Institute, Vanderbilt University, and Mayo Clinic. Each participating site extracted its study samples from the electronic health records (EHR) using an algorithm for a specific disease or phenotype. Once samples had been selected and genotyped, they were available for phenotyping with additional algorithms. Thus, the cataract algorithm was deployed across the network. The cases and the controls had to meet the following inclusion criteria: The cases were age 50 years or older at the time of diagnosis or surgery, and the controls were age 50 years or older at the time of the most recent eye exam and had had an eye exam within the previous 5 years. The controls had no diagnostic codes for cataract or evidence of cataract surgery. The cases were identified as “surgical” or “diagnosis only.” Surgical cases had undergone a cataract extraction in at least one eye. The diagnosis-only cases were required to have either cataract diagnoses on two or more dates or one diagnosis date plus one or more inclusion cataract terms found by natural language processing and optical character recognition (NLP/OCR). Cataract type was extracted from the notes using NLP/OCR, with validation through manual chart abstraction [45,46].
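
The inclusion rules above can be sketched as a small decision function. This is only an illustration of the criteria as stated in the text, not the eMERGE algorithm itself: the `EyeRecord` fields, the `classify` function, and the returned labels are hypothetical names, and reading "within the previous 5 years" as "at most 5 years" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EyeRecord:
    """Hypothetical per-person summary of the EHR fields the rules need."""
    age_at_first_dx_or_surgery: Optional[float]  # None if no diagnosis or surgery
    had_cataract_surgery: bool                   # extraction in at least one eye
    n_diagnosis_dates: int                       # distinct dates with a cataract code
    n_nlp_ocr_terms: int                         # inclusion terms found by NLP/OCR
    age_at_last_eye_exam: Optional[float]
    years_since_last_eye_exam: Optional[float]


def classify(r: EyeRecord) -> str:
    """Return 'case_surgical', 'case_diagnosis_only', 'control', or 'excluded'."""
    has_code_or_surgery = r.had_cataract_surgery or r.n_diagnosis_dates > 0
    if has_code_or_surgery:
        # Cases must be 50 or older at diagnosis or surgery.
        age = r.age_at_first_dx_or_surgery
        if age is None or age < 50:
            return "excluded"
        if r.had_cataract_surgery:
            return "case_surgical"
        # Diagnosis-only: two or more dates, or one date plus NLP/OCR support.
        if r.n_diagnosis_dates >= 2:
            return "case_diagnosis_only"
        if r.n_diagnosis_dates == 1 and r.n_nlp_ocr_terms >= 1:
            return "case_diagnosis_only"
        return "excluded"
    # Controls: no codes or surgery, 50 or older at the most recent eye exam,
    # and an eye exam within the previous 5 years (assumed to mean <= 5).
    if (r.age_at_last_eye_exam is not None
            and r.age_at_last_eye_exam >= 50
            and r.years_since_last_eye_exam is not None
            and r.years_since_last_eye_exam <= 5):
        return "control"
    return "excluded"
```

For example, `classify(EyeRecord(None, False, 0, 0, 62, 2))` yields a control, whereas a person with a single diagnosis date and no NLP/OCR terms is excluded rather than counted as a case.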

Genotypic data

Genome-wide genotyping has been performed on approximately 17,000 samples across the network at the Broad Institute and at the Center for Inherited Disease Research (CIDR) using the Illumina 660W-Quad or 1M-Duo Beadchips (CIDR, Baltimore, MD). For this particular study, which includes predominantly individuals of European descent, we used only the Illumina 660W-Quad platform. This platform consists of 561,490 SNPs and 95,876 intensity-only probes. Genotyping calls were made at either CIDR or Broad using BeadStudio version 3.3.7. The eMERGE Cataract dataset pre-quality control (QC) included 7,535 DNA samples and 344 HapMap controls: 3,968 Marshfield Clinic, 2,379 Group Health, 986 Mayo, and 202 Vanderbilt BioVU. Data were cleaned using the eMERGE QC pipeline developed by the eMERGE Genomics Working Group [47]. This process includes evaluation of the sample and marker call rate, gender mismatch, duplicate and HapMap concordance, batch effects, Hardy–Weinberg equilibrium, sample relatedness, and population stratification. After QC, 530,101 SNPs and 7,397 samples were used for analysis (see Table 1 for distribution by site). All genotype data and a detailed QC report for each individual site, as well as the merged eMERGE dataset, can be found on dbGaP, and the detailed eMERGE QC pipeline can be found in [47,48].

Statistical analyses

Single-locus tests of association were performed using PLINK [49] assuming an additive genetic model for all 530,101 SNPs in a total of 7,397 unrelated individuals (5,503 cases and 1,894 controls). We calculated principal components using the EIGENSTRAT program [50], which uses principal components analysis to detect and correct for population stratification in genome-wide association studies. We adjusted our analyses for the first three principal components (PCs) to avoid spurious associations caused by population stratification, and we present the results of the analysis adjusted for principal components 1–3 (PC1–3).
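
To make the model concrete, the following is a minimal sketch of a single-SNP test under an additive model with principal-component covariates, written as a plain logistic regression fitted by Newton-Raphson. It is not PLINK's implementation; the function names (`additive_dosage`, `fit_logistic`, `design_row`) are hypothetical, and the sketch omits standard errors and p values.

```python
import math


def additive_dosage(genotype: str, minor_allele: str) -> int:
    """Additive coding: number of copies of the minor allele (0, 1, or 2)."""
    return sum(1 for allele in genotype if allele == minor_allele)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(rows, labels, iterations=25):
    """Newton-Raphson fit of logit P(case) = sum_j beta_j * x_j; returns beta."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, y in zip(rows, labels):
            eta = sum(b * v for b, v in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (y - mu) * x[j]
                for k in range(p):
                    hess[j][k] += w * x[j] * x[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def design_row(dosage, pcs):
    """One design-matrix row: intercept, SNP dosage, then PC covariates."""
    return [1.0, float(dosage)] + [float(v) for v in pcs]
```

With rows built as `design_row(additive_dosage(genotype, minor_allele), [pc1, pc2, pc3])` and labels coded 1 for cases and 0 for controls, `math.exp(beta[1])` is the estimated per-allele odds ratio adjusted for PC1–3.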

We also performed an age-at-diagnosis association analysis using cases only. Age at diagnosis is defined as the age when the first cataract diagnosis was made in the electronic health record. We performed both an unadjusted analysis and an analysis adjusted for PC1–3 using linear regression in PLINK. In Table 2 and Table 3, we report all p values <1×10−5. All associations identified by our analyses are suggestive and must be replicated in independent datasets because the signals did not reach a Bonferroni-corrected genome-wide statistical significance level.
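
As a worked illustration of why these signals are described as suggestive, the sketch below computes a Bonferroni threshold for 530,101 tests and compares it with three of the p values reported in the Results. The family-wise error rate of 0.05 is an assumption (the text does not state the alpha used), and `classify_p` is a hypothetical helper.

```python
N_SNPS = 530_101             # SNPs tested after QC (from the text)
ALPHA = 0.05                 # assumed family-wise error rate; not stated in the text
REPORTING_THRESHOLD = 1e-5   # threshold used for Table 2 and Table 3

# Bonferroni correction: divide alpha by the number of tests (about 9.4e-8 here).
bonferroni_threshold = ALPHA / N_SNPS


def classify_p(p: float) -> str:
    """Label a p value relative to the two thresholds."""
    if p < bonferroni_threshold:
        return "genome-wide significant"
    if p < REPORTING_THRESHOLD:
        return "suggestive"
    return "not reported"


# p values quoted in the Results for three top hits.
top_hits = {"GAN": 2.42e-6, "ALDOB": 2.46e-6, "ACSS3": 6.39e-7}
labels = {gene: classify_p(p) for gene, p in top_hits.items()}
```

Under this assumption, even the smallest of these p values (6.39×10−7 for ACSS3) is larger than the corrected threshold, so all three fall in the suggestive range.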

Results

Figure 1 shows the Manhattan plots for the single-locus tests of association for the adjusted cataract case-control analysis (Figure 1A) and the adjusted age-at-diagnosis analysis (Figure 1B), and Figure 2 shows the corresponding QQ plots for each GWAS analysis. Our top hits in the adjusted case-control analysis include gigaxonin (GAN; Gene_ID: 8139, OMIM: 605379; p value = 2.42×10−6), which encodes a member of the cytoskeletal Broad-Complex, Tramtrack, and Bric a brac (BTB/kelch) repeat family. The encoded protein plays a role in neurofilament architecture and is involved in mediating the ubiquitination and degradation of some proteins. Defects in this gene are a cause of giant axonal neuropathy (GAN). Other potentially interesting findings include DNER (Gene_ID: 92737; OMIM: 607299; p value = 1.87×10−5), which encodes the Delta and Notch-like epidermal growth factor-related receptor, and EHHADH (Gene_ID: 1962; OMIM: 607037; p value = 2.80×10−5), which encodes enoyl-CoA hydratase/3-hydroxyacyl CoA dehydrogenase. Myocyte-specific enhancer factor 2C, also known as MADS box transcription enhancer factor 2, polypeptide C, is a protein that in humans is encoded by the MEF2C gene (Gene_ID: 4208; OMIM: 600662; p value = 7.26×10−5). MEF2C upregulates the expression of the homeodomain transcription factors DLX5 and DLX6, which are necessary for craniofacial development [51]. This could be another interesting link to cataracts.

Several SNPs in or near ALDOB (Gene_ID: 229; OMIM: 612724; p value = 2.46×10−6), which encodes aldolase B, fructose-bisphosphate, were also associated with cataracts in our GWAS analysis. Mutations in this gene result in an autosomal recessive disorder of fructose intolerance, and cases of cataract have been reported in the first decade of life [52]. Another interesting associated gene is MAP3K1 (Gene_ID: 4214; OMIM: 600982; p value = 1.33×10−5), a functional mitogen-activated protein kinase kinase kinase 1. Molecular signatures of MAP3K1 have been shown to be important in embryonic eyelid closure in the mouse [53]. In total, 45 SNPs met the reporting threshold of p<1×10−5.

In the age-at-diagnosis analysis, our top hits include ACSS3 (Gene_ID: 79611; OMIM: 614356; p value = 6.39×10−7), which encodes acyl-CoA synthetase short-chain family member 3, and EPHA4 (Gene_ID: 2043; OMIM: 602188; p value = 7.03×10−5), which encodes ephrin type-A receptor 4. EPHA4 belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family, along with EPHA2. EPH and EPH-related receptors have been implicated in mediating developmental events, especially in the nervous system [54].

Discussion

This study is the first genome-wide association study in age-related cataract reported in the literature. Cataract in type 2 diabetes has been investigated, and a region on chromosome 3p14.1–3p14.2 was identified in a Han Chinese population [55]. The five SNPs identified in that study do not show evidence of association in our eMERGE cataract GWAS. It is difficult to interpret these results, however, because age-related cataracts and cataracts in type 2 diabetics may be two different phenotypes with disparate etiologies. In addition, our dataset does not include a large number of individuals with type 2 diabetes (see Table 1); thus, we were underpowered to explore this specific type of association. Other previously published research on gene mapping in cataracts supports a linkage region on chromosome 1 [56] and association with EPHA2 [30,31]. In our GWAS, we did not see evidence for association with EPHA2, although we did see association with EPHA4. One significant difference in this study is the phenotyping of cases and controls based on electronic health records (EHR) in population-based cohorts, rather than family-based samples. However, our study, in addition to the literature, supports the suggestion of cataract-susceptibility loci on chromosome 1. Replication studies and larger sample sizes are needed to validate and confirm these findings.

Although the eMERGE network has demonstrated the utility of electronic phenotyping in EHR for several traits [57-61], there are inherent challenges with this approach. For ophthalmic conditions specifically, coded EHR information is extremely limited or, in some health systems, absent. Thus, sophisticated phenotyping strategies must be established [45,46]. Still, the success of the EHR and biobank approach for association studies is unprecedented. The ability to perform multiple GWAS simultaneously with no additional genotyping is an enormous benefit [58]. Once a set of patient samples has been genotyped on a genome-wide association platform, those data can be reused for multiple additional genotype-phenotype association studies. In particular, the eMERGE network has done this extensively for quantitative traits and clinical laboratory variables such as cholesterol [60], red-blood cell indices [59], and white blood cell count [57]. The additional effort is expended on creating electronic phenotyping algorithms, rather than on collecting samples and genotyping. Thus, this is an enormous resource for subsequent genotype-phenotype association studies.

Future explorations of age-related cataract include validating and replicating the association results identified herein. Unfortunately, because of the sample size and the limited power that would result from stratifying cases and controls by eMERGE site, we did not have the opportunity to replicate these findings within eMERGE. The goal is to identify a similar study population where these results can be explored. In addition, we are beginning to investigate the role of gene–gene and gene–environment interactions associated with cataracts [62]. Due to the complexity of the trait, we hypothesize that the genetic architecture will be similar to that of other complex traits: multigenic with a combination of genetic and environmental interactions.

As demonstrated by this and other studies, the beauty of using an electronic health record is the ability to reuse genotyped samples for various phenotypes. The eMERGE network has clearly demonstrated the success of this study design, and continues to demonstrate the strengths and limitations of this approach.

Acknowledgments

The eMERGE Network was initiated and funded by NHGRI, with additional funding from NIGMS through the following grants: U01HG004610 (Group Health Cooperative); U01HG004608 (Marshfield Clinic); U01HG04599 (Mayo Clinic); U01HG004609 (Northwestern University); U01HG04603 (Vanderbilt University, also serving as the Coordinating Center); U01HG006389 (Essentia Institute of Rural Health). The Northwest Institute of Medical Genetics is also supported by a State of Washington Life Sciences Discovery Fund award.

References

  1. Thylefors B, Negrel A-D. Available data on blindness. Geneva, Switzerland: World Health Organization; 1994.
  2. Black A, Wood J. Vision and falls. Clin Exp Optom. 2005; 88:212-22. [PMID: 16083415]
  3. Congdon N, O’Colmain B, Klaver CCW, Klein R, Muñoz B, Friedman DS, Kempen J, Taylor HR, Mitchell P. Causes and prevalence of visual impairment among adults in the United States. Arch Ophthalmol. 2004; 122:477-85. [PMID: 15078664]
  4. Ellwein LB, Urato CJ. Use of eye care and associated charges among the Medicare population: 1991–1998. Arch Ophthalmol. 2002; 120:804-11. [PMID: 12049587]
  5. Podgor MJ, Cassel GH, Kannel WB. Lens changes and survival in a population-based study. N Engl J Med. 1985; 313:1438-44. [PMID: 4058547]
  6. Minassian DC, Mehra V, Johnson GJ. Mortality and cataract: findings from a population-based longitudinal study. Bull World Health Organ. 1992; 70:219-23. [PMID: 1600582]
  7. West SK, Muñoz B, Istre J, Rubin GS, Friedman SM, Fried LP, Bandeen-Roche K, Schein OD. Mixed lens opacities and subsequent mortality. Arch Ophthalmol. 2000; 118:393-7. [PMID: 10721963]
  8. Wang JJ, Mitchell P, Simpson JM, Cumming RG, Smith W. Visual impairment, age-related cataract, and mortality. Arch Ophthalmol. 2001; 119:1186-90. [PMID: 11483087]
  9. Williams SL, Ferrigno L, Mora P, Rosmini F, Maraini G. Baseline cataract type and 10-year mortality in the Italian-American Case-Control Study of age-related cataract. Am J Epidemiol. 2002; 156:127-31. [PMID: 12117703]
  10. Reidy A, Minassian DC, Desai P, Vafidis G, Joseph J, Farrow S, Connolly A. Increased mortality in women with cataract: a population based follow up of the North London Eye Study. Br J Ophthalmol. 2002; 86:424-8. [PMID: 11914212]
  11. Clemons TE, Kurinij N, Sperduto RD. Associations of mortality with ocular disorders and an intervention of high-dose antioxidants and zinc in the Age-Related Eye Disease Study: AREDS Report No. 13. Arch Ophthalmol. 2004; 122:716-26. [PMID: 15136320]
  12. Knudtson MD, Klein BEK, Klein R. Age-related eye disease, visual impairment, and survival: the Beaver Dam Eye Study. Arch Ophthalmol. 2006; 124:243-9. [PMID: 16476894]
  13. Congdon N, Vingerling JR, Klein BEK, West S, Friedman DS, Kempen J, O’Colmain B, Wu S-Y, Taylor HR. Prevalence of cataract and pseudophakia/aphakia among adults in the United States. Arch Ophthalmol. 2004; 122:487-94. [PMID: 15078665]
  14. Reddy MA, Francis PJ, Berry V, Bhattacharya SS, Moore AT. Molecular genetic basis of inherited cataract and associated phenotypes. Surv Ophthalmol. 2004; 49:300-15. [PMID: 15110667]
  15. Moore AT. Understanding the molecular genetics of congenital cataract may have wider implications for age related cataract. Br J Ophthalmol. 2004; 88:2-3. [PMID: 14693758]
  16. Hejtmancik JF, Kantorow M. Molecular genetics of age-related cataract. Exp Eye Res. 2004; 79:3-9. [PMID: 15183095]
  17. Heiba IM, Elston RC, Klein BE, Klein R. Evidence for a major gene for cortical cataract. Invest Ophthalmol Vis Sci. 1995; 36:227-35. [PMID: 7822150]
  18. Heiba IM, Elston RC, Klein BE, Klein R. Genetic etiology of nuclear cataract: evidence for a major gene. Am J Med Genet. 1993; 47:1208-14. [PMID: 8291558]
  19. The Framingham Offspring Eye Study Group. Familial aggregation of lens opacities: the Framingham Eye Study and the Framingham Offspring Eye Study. Am J Epidemiol. 1994; 140:555-64. [PMID: 8067349]
  20. Hammond CJ, Duncan DD, Snieder H, De Lange M, West SK, Spector TD, Gilbert CE. The heritability of age-related cortical cataract: the twin eye study. Invest Ophthalmol Vis Sci. 2001; 42:601-5. [PMID: 11222516]
  21. Hammond CJ, Snieder H, Spector TD, Gilbert CE. Genetic and environmental factors in age-related nuclear cataracts in monozygotic and dizygotic twins. N Engl J Med. 2000; 342:1786-90. [PMID: 10853001]
  22. Iyengar SK, Klein BEK, Klein R, Jun G, Schick JH, Millard C, Liptak R, Russo K, Lee KE, Elston RC. Identification of a major locus for age-related cortical cataract on chromosome 6p12-q12 in the Beaver Dam Eye Study. Proc Natl Acad Sci USA. 2004; 101:14485-90. [PMID: 15452352]
  23. Okano Y, Asada M, Fujimoto A, Ohtake A, Murayama K, Hsiao KJ, Choeh K, Yang Y, Cao Q, Reichardt JK, Niihira S, Imamura T, Yamano T. A genetic factor for age-related cataract: identification and characterization of a novel galactokinase variant, “Osaka,” in Asians. Am J Hum Genet. 2001; 68:1036-42. [PMID: 11231902]
  24. Maraini G, Hejtmancik JF, Shiels A, Mackay DS, Aldigeri R, Jiao XD, Williams SL, Sperduto RD, Reed G. Galactokinase gene mutations and age-related cataract. Lack of association in an Italian population. Mol Vis. 2003; 9:397-400. [PMID: 12942049]
  25. Zetterberg M, Zetterberg H, Palmér M, Rymo L, Blennow K, Tasa G, Juronen E, Veromann S, Teesalu P, Karlsson J-O, Höglund K. Apolipoprotein E polymorphism in patients with cataract. Br J Ophthalmol. 2004; 88:716-8. [PMID: 15090431]
  26. Juronen E, Tasa G, Veromann S, Parts L, Tiidla A, Pulges R, Panov A, Soovere L, Koka K, Mikelsaar AV. Polymorphic glutathione S-transferases as genetic risk factors for senile cortical cataract in Estonians. Invest Ophthalmol Vis Sci. 2000; 41:2262-7. [PMID: 10892871]
  27. Tamer L, Yilmaz A, Yildirim H, Ayaz L, Ates NA, Karakas S, Oz O, Yildirim O, Atik U. N-acetyltransferase 2 phenotype may be associated with susceptibility to age-related cataract. Curr Eye Res. 2005; 30:835-9. [PMID: 16251120]
  28. Meyer D, Parkin DP, Seifart HI, Maritz JS, Engelbrecht AH, Werely CJ, Van Helden PD. NAT2 slow acetylator function as a risk indicator for age-related cataract formation. Pharmacogenetics. 2003; 13:285-9. [PMID: 12724621]
  29. Lee S-M, Tseng L-M, Li A-F, Liu H-C, Liu T-Y, Chi C-W. Polymorphism of estrogen metabolism genes and cataract. Med Hypotheses. 2004; 63:494-7. [PMID: 15288375]
  30. Shiels A, Bennett TM, Knopf HLS, Maraini G, Li A, Jiao X, Hejtmancik JF. The EPHA2 gene is associated with cataracts linked to chromosome 1p. Mol Vis. 2008; 14:2042-55. [PMID: 19005574]
  31. Jun G, Guo H, Klein BEK, Klein R, Wang JJ, Mitchell P, Miao H, Lee KE, Joshi T, Buck M, Chugha P, Bardenstein D, Klein AP, Bailey-Wilson JE, Gong X, Spector TD, Andrew T, Hammond CJ, Elston RC, Iyengar SK, Wang B. EPHA2 is associated with age-related cortical cataract in mice and humans. PLoS Genet. 2009; 5:e1000584 [PMID: 19649315]
  32. Glynn RJ, Christen WG, Manson JE, Bernheimer J, Hennekens CH. Body mass index. An independent predictor of cataract. Arch Ophthalmol. 1995; 113:1131-7. [PMID: 7661746]
  33. Hiller R, Podgor MJ, Sperduto RD, Nowroozi L, Wilson PW, D’Agostino RB, Colton T. A longitudinal study of body mass index and lens opacities. The Framingham Studies. Ophthalmology. 1998; 105:1244-50. [PMID: 9663229]
  34. Caulfield LE, West SK, Barrón Y, Cid-Ruzafa J. Anthropometric status and cataract: the Salisbury Eye Evaluation project. Am J Clin Nutr. 1999; 69:237-42. [PMID: 9989686]
  35. Schaumberg DA, Glynn RJ, Christen WG, Hankinson SE, Hennekens CH. Relations of body fat distribution and height with cataract in men. Am J Clin Nutr. 2000; 72:1495-502. [PMID: 11101477]
  36. Weintraub JM, Willett WC, Rosner B, Colditz GA, Seddon JM, Hankinson SE. A prospective study of the relationship between body mass index and cataract extraction among US women and men. Int J Obes Relat Metab Disord. 2002; 26:1588-95. [PMID: 12461675]
  37. Jacques PF, Moeller SM, Hankinson SE, Chylack LT Jr, Rogers G, Tung W, Wolfe JK, Willett WC, Taylor A. Weight status, abdominal adiposity, diabetes, and early age-related lens opacities. Am J Clin Nutr. 2003; 78:400-5. [PMID: 12936921]
  38. Kuang T-M, Tsai S-Y, Hsu W-M, Cheng C-Y, Liu J-H, Chou P. Body mass index and age-related cataract: the Shihpai Eye Study. Arch Ophthalmol. 2005; 123:1109-14. [PMID: 16087846]
  39. Lim LS, Tai E-S, Aung T, Tay WT, Saw SM, Seielstad M, Wong TY. Relation of age-related cataract with obesity and obesity genes in an Asian population. Am J Epidemiol. 2009; 169:1267-74. [PMID: 19329528]
  40. Congdon N, Broman KW, Lai H, Munoz B, Bowie H, Gilbert D, Wojciechowski R, Alston C, West SK. Nuclear cataract shows significant familial aggregation in an older population after adjustment for possible shared environmental factors. Invest Ophthalmol Vis Sci. 2004; 45:2182-6. [PMID: 15223793]
  41. Klein AP, Duggal P, Lee KE, O’Neill JA, Klein R, Bailey-Wilson JE, Klein BEK. Polygenic effects and cigarette smoking account for a portion of the familial aggregation of nuclear sclerosis. Am J Epidemiol. 2005; 161:707-13. [PMID: 15800262]
  42. Saadat M, Farvardin-Jahromi M, Saadat H. Null genotype of glutathione S-transferase M1 is associated with senile cataract susceptibility in non-smoker females. Biochem Biophys Res Commun. 2004; 319:1287-91. [PMID: 15194507]
  43. Saadat M, Farvardin-Jahromi M. Occupational sunlight exposure, polymorphism of glutathione S-transferase M1, and senile cataract risk. Occup Environ Med. 2006; 63:503-4. [PMID: 16551760]
  44. McCarty CA, Chisholm RL, Chute CG, Kullo IJ, Jarvik GP, Larson EB, Li R, Masys DR, Ritchie MD, Roden DM, Struewing JP, Wolf WA. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics. 2011; 4:13 [PMID: 21269473]
  45. Peissig PL, Rasmussen LV, Berg RL, Linneman JG, McCarty CA, Waudby C, Chen L, Denny JC, Wilke RA, Pathak J, Carrell D, Kho AN, Starren JB. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records. J Am Med Inform Assoc. 2012; 19:225-34. [PMID: 22319176]
  46. Rasmussen LV, Peissig PL, McCarty CA, Starren J. Development of an optical character recognition pipeline for handwritten form fields from an electronic health record. J Am Med Inform Assoc. 2012; 19:e90-5. [PMID: 21890871]
  47. Zuvich RL, Armstrong LL, Bielinski SJ, Bradford Y, Carlson CS, Crawford DC, Crenshaw AT, De Andrade M, Doheny KF, Haines JL, Hayes MG, Jarvik GP, Jiang L, Kullo IJ, Li R, Ling H, Manolio TA, Matsumoto ME, McCarty CA, McDavid AN, Mirel DB, Olson LM, Paschall JE, Pugh EW, Rasmussen LV, Rasmussen‐Torvik LJ, Turner SD, Wilke RA, Ritchie MD. Pitfalls of merging GWAS data: lessons learned in the eMERGE network and quality control procedures to maintain high data quality. Genet Epidemiol. 2011; 35:887-98. [PMID: 22125226]
  48. Turner S, Armstrong LL, Bradford Y, Carlson CS, Crawford DC, Crenshaw AT, De Andrade M, Doheny KF, Haines JL, Hayes G, Jarvik G, Jiang L, Kullo IJ, Li R, Ling H, Manolio TA, Matsumoto M, McCarty CA, McDavid AN, Mirel DB, Paschall JE, Pugh EW, Rasmussen LV, Wilke RA, Zuvich RL, Ritchie MD. Quality control procedures for genome-wide association studies. Curr Protoc Hum Genet. 2011;Chapter 1:Unit1.19.
  49. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, De Bakker PIW, Daly MJ, Sham PC. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am J Hum Genet. 2007; 81:559-75. [PMID: 17701901]
  50. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006; 38:904-9. [PMID: 16862161]
  51. Verzi MP, Agarwal P, Brown C, McCulley DJ, Schwarz JJ, Black BL. The transcription factor MEF2C is required for craniofacial development. Dev Cell. 2007; 12:645-52. [PMID: 17420000]
  52. Sitadevi C, Ramaiah Y, Askari Z. Fructose intolerance associated with congenital cataract. Report of a case. Indian J Pediatr. 1968; 35:496-8. [PMID: 5719655]
  53. Jin C, Chen J, Meng Q, Carreira V, Tam NNC, Geh E, Karyala S, Ho S-M, Zhou X, Medvedovic M, Xia Y. Deciphering gene expression program of MAP3K1 in mouse eyelid morphogenesis. Dev Biol. 2013; 374:96-107. [PMID: 23201579]
  54. Pasquale EB. Eph receptors and ephrins in cancer: bidirectional signaling and beyond. Nat Rev Cancer. 2010; 10:165-80. [PMID: 20179713]
  55. Lin H-J, Huang Y-C, Lin J-M, Wu J-Y, Chen L-A, Lin C-J, Tsui Y-P, Chen C-P, Tsai F-J. Single-nucleotide polymorphisms in chromosome 3p14.1-3p14.2 are associated with susceptibility of type 2 diabetes with cataract. Mol Vis. 2010; 16:1206-14. [PMID: 20664687]
  56. Ionides AC, Berry V, Mackay DS, Moore AT, Bhattacharya SS, Shiels A. A locus for autosomal dominant posterior polar cataract on chromosome 1p. Hum Mol Genet. 1997; 6:47-51. [PMID: 9002669]
  57. Crosslin DR, McDavid A, Weston N, Nelson SC, Zheng X, Hart E, De Andrade M, Kullo IJ, McCarty CA, Doheny KF, Pugh E, Kho A, Hayes MG, Pretel S, Saip A, Ritchie MD, Crawford DC, Crane PK, Newton K, Li R, Mirel DB, Crenshaw A, Larson EB, Carlson CS, Jarvik GP. Genetic variants associated with the white blood cell count in 13,923 subjects in the eMERGE Network. Hum Genet. 2011 Oct 30. [PMID: 22037903]
  58. Denny JC, Crawford DC, Ritchie MD, Bielinski SJ, Basford MA, Bradford Y, Chai HS, Bastarache L, Zuvich R, Peissig P, Carrell D, Ramirez AH, Pathak J, Wilke RA, Rasmussen L, Wang X, Pacheco JA, Kho AN, Hayes MG, Weston N, Matsumoto M, Kopp PA, Newton KM, Jarvik GP, Li R, Manolio TA, Kullo IJ, Chute CG, Chisholm RL, Larson EB, McCarty CA, Masys DR, Roden DM, De Andrade M. Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies. Am J Hum Genet. 2011; 89:529-42. [PMID: 21981779]
  59. Kullo IJ, Ding K, Shameer K, McCarty CA, Jarvik GP, Denny JC, Ritchie MD, Ye Z, Crosslin DR, Chisholm RL, Manolio TA, Chute CG. Complement receptor 1 gene variants are associated with erythrocyte sedimentation rate. Am J Hum Genet. 2011; 89:131-8. [PMID: 21700265]
  60. Turner SD, Berg RL, Linneman JG, Peissig PL, Crawford DC, Denny JC, Roden DM, McCarty CA, Ritchie MD, Wilke RA. Knowledge-driven multi-locus analysis reveals gene-gene interactions influencing HDL cholesterol level in two independent EMR-linked biobanks. PLoS ONE. 2011; 6:e19586 [PMID: 21589926]
  61. Wilke RA, Berg RL, Linneman JG, Peissig P, Starren J, Ritchie MD, McCarty CA. Quantification of the clinical modifiers impacting high-density lipoprotein cholesterol in the community: Personalized Medicine Research Project. Prev Cardiol. 2010; 13:63-8. [PMID: 20377807]
  62. Pendergrass SA, Verma SS, Holzinger ER, Moore CB, Wallace J, Dudek SM, Huggins W, Kitchner T, Waudby C, Berg R, McCarty CA, Ritchie MD. Next-generation analysis of cataracts: determining knowledge driven gene-gene interactions using Biofilter, and gene-environment interactions using the PhenX Toolkit. Pac Symp Biocomput. 2013;147–58.