Molecular Vision 2020; 26:378-391
Received 20 January 2020 | Accepted 19 May 2020 | Published 21 May 2020
1Department of ophthalmology, Daping Hospital of the Army Medical University, Chongqing, China; 2Radiation & Cancer Biology Laboratory, Oncology Radiotherapy Center, Chongqing University Cancer Hospital & Chongqing Cancer Institute & Chongqing Cancer Hospital, Chongqing, China
Correspondence to: Xiaolong Shi, Radiation & Cancer Biology Laboratory, Oncology Radiotherapy Center, hongqing University Cancer Hospital & Chongqing Cancer Institute & Chongqing Cancer Hospital, Chongqing 400030, China; email: firstname.lastname@example.org
Purpose: Family-based genetic linkage analysis and genome-wide association studies (GWASs) have identified many genomic loci associated with primary open-angle glaucoma (POAG). Several causative genes of POAG have been intensively analyzed by sequencing in different populations. However, few investigations have been conducted on the identification of variants of coding region in the genes identified in GWASs. Therefore, further research is needed to investigate whether they harbor pathogenically relevant rare coding variants and account for the observed association.
Methods: To identify the potentially disease-relevant variants (PDVs) in POAG-associated genes in Chinese patients, we applied molecular inversion probe (MIP)-based panel sequencing to analyze 26 candidate genes in 235 patients with POAG and 241 control subjects.
Results: The analysis identified 82 PDVs in 66 individuals across 235 patients with POAG. By comparison, only 18 PDVs in 19 control subjects were found, indicating an enrichment of PDVs in the POAG cohort (28.1% versus 7.9%, p = 8.629e-09). Among 26 candidate genes, the prevalence rate of PDVs in five genes showed a statistically significant difference between patients and controls (33 out of 235 versus 1 out of 241, p = 4.533e-10), including ATXN2 (p = 0.0033), TXNRD2 (p = 0.0190), MYOC (p = 0.0140), FOXC1 (p = 0.0140), and CDKN2B (p = 0.0287). Furthermore, two sisters harboring a stop-loss mutation EFEMP1 p.Ter494Glu were found in the POAG cohort, and further analysis of the family strongly suggested that EFEMP1 p.Ter494Glu was a potentially disease-causing mutation for POAG. A statistically significant difference in age at diagnosis between patients with PDVs and those without PDVs was found, implying that some of the identified PDVs may have a role in promoting the early onset of POAG disease.
Conclusions: The results suggest that some of the associations identified in POAG risk loci can be ascribed to rare coding variants with likely functional effects that modify POAG risk.
Glaucoma is a complex progressive neurodegenerative eye disease caused by genetic and environmental factors . It is the leading cause of irreversible blindness worldwide. There are an estimated 57.5 million people worldwide with glaucoma . It was estimated recently that there will be approximately 79.6 million people with glaucoma by 2020 [3,4]. Based on anatomic changes in the anterior chamber angle, primary glaucoma may be classified as primary angle closure glaucoma (PACG) or primary open-angle glaucoma (POAG). POAG is the most common form of glaucoma which is clinically characterized by an open and normal anterior iridocorneal chamber angle and responsible for 74% of all glaucoma cases, affecting 1% to 2% of the population over age 40 . POAG may be further subdivided into juvenile open‐angle glaucoma (JOAG) and adult onset POAG . JOAG is an uncommon subset of POAG that differs from adult-onset POAG in the age of onset and often in the magnitude of intraocular pressure (IOP) elevation.
Family pedigree studies have indicated that some portion of glaucoma is inherited in a Mendelian pattern. Family-based genetic linkage analysis has identified many genomic loci associated with POAG, including GLC1A to GLC1P, and GLC3A [6,7]. Several causative genes, such as MYOC (Gene ID 4653, OMIM 601652) [8,9], OPTN (Gene ID 10133, OMIM 602432) , WDR36 (Gene ID 134430, OMIM 609669) , CYP1B1 (Gene ID 1545, OMIM 601771) [12,13], ASB10 (Gene ID 136371, OMIM 615054) [14,15], and FOXC1 (Gene ID 2296, OMIM 601090)  were discovered. To date, only 5–10% of POAG is attributed to single-gene or Mendelian forms of glaucoma . Many of the other cases of POAG are likely due to the combined effects of several genetic and environmental risk factors. This indicates that many cases of POAG have a more complex genetic basis. In recent years, genome-wide association studies (GWASs) have been used to locate the genetic risk factors that contribute to the development of glaucoma. GWASs have demonstrated many genomic loci are associated with POAG risk, including CDKN2B-AS1 (Gene ID 100048912, OMIM 613149), TMCO1 (Gene ID 54499, OMIM 614123), CAV1 (Gene ID 857, OMIM 601047), CAV2 (Gene ID 858, OMIM 601048), SIX1 (Gene ID 6495, OMIM 601205), SIX6 (Gene ID 4990, OMIM 606326), AFAP1 (Gene ID 60312, OMIM 608252), ABCA1 (Gene ID 19, OMIM 600046), TXNRD2 (Gene ID 10587, OMIM 606448), FOXC1/GMDS (Gene ID 2762, OMIM 602884), ATXN2 (Gene ID 6311, OMIM 601517), FNDC3B (Gene ID 64778, OMIM 61190), ABO (Gene ID 28, OMIM 110300), PMM2 (Gene ID 5373, OMIM 601785), ATOH7 (Gene ID 220202, OMIM 609875), TMCO1 (Gene ID 54499, OMIM 614123), and GAS7 (Gene ID 8522, OMIM 603127) [1,6]. GWASs use single nucleotide polymorphism (SNP) markers to locate the new susceptibility gene; therefore, GWASs rarely identify functional or causal variants and cannot provide direct information on mutation patterns of susceptive genes. Mutation screening of the candidate genes is crucial for implicating the genes as the cause of a particular disease. Currently, several causative genes, such as MYOC,OPTN, WDR36, CYP1B1, ASB10, FOXC1, and PITX2 (Gene ID 5308, OMIM 601542), have been intensively analyzed by sequencing in different populations [8-12]. However, for the genes identified in GWASs, little information for variants detected in coding regions is available. Therefore, further research is needed to investigate whether these genes harbor pathogenically relevant rare coding variants and account for the observed association.
Recent advances in high-throughput sequencing technologies have dramatically improved the efficiency of screening for mutations in genes. Target sequencing for multiple candidate genes has become an important strategy for cost-effective detection of all possible mutations [17,18]. In this study, we used molecular inversion probe (MIP)-based panel sequencing technology to identify rare coding variants associated with POAG. This technology enables high-throughput sequencing for coding regions of multiple candidate genes . Furthermore, all steps of enriching the target area with an MIP can be performed in one test tube, so it is suitable for the screening of a large number of samples. In this study, a 26-gene panel were designed for analysis of the coding variants. This panel contained several POAG-causing genes, such as MYOC, OPTN, WDR36, ASB10, and CYP1B1, and many genes identified in GWASs or in association studies using the candidate-gene approach, such as ABCA1, AFAP1, ABO, ATXN2, ATOH7, CAV1, CAV2, CDKN2B (Gene ID 1030, OMIM 600431), LOXL1 (Gene ID 4016, OMIM 153456), and APOE (Gene ID 348, OMIM 107741). In addition, we included several candidate genes for research on their putative role in POAG, such as LOXL1, which has been reported with exfoliation glaucoma ; EFEMP1, which was identified by exon sequencing of families with POAG ; PAK7 (Gene ID 57144, OMIM 608038), which has shown copy number variation (CNV) in patients with POAG ; and VRK2 (Gene ID 7444, OMIM 602169), which is located in 2p15-p16 which was associated with POAG in a previous study we conducted . By applying this panel, we performed the identification of potentially disease-relevant variants (PDVs) in 235 Han Chinese patients with POAG and 241 controls. We identified numerous variants of interest for further analysis.
This study was approved by the ethics committee of Daping Hospital of the army medical university. All experimental procedures were conducted in accordance with the Association for Research in Vision and Ophthalmology (ARVO) statement on human subjects. All participants signed an informed consent form before participating in the study in accordance with the provisions of the Declaration of Helsinki. A total of 235 patients with POAG and 241 unrelated control subjects without glaucoma were recruited at the Third Affiliated Hospital (Daping Hospital) of the Army Medical University from January 2018 to December 2018. All patients were assigned a study identity number that was known only to the ophthalmologist in the researcher group. These numbers were subsequently converted to linked study identities such that others could not access patients’ private information. All human samples were linked to the study identities and barcoded, anonymized, and tracked in a centralized database, which was overseen by the researchers.
Most of the 235 patients are sporadic cases except the one set of siblings reported in this study (Appendix 1). All patients were Han Chinese. Patients with POAG were diagnosed by glaucoma specialists and fulfilled the following diagnostic criteria: open angles on gonioscopy (Grade 3 or 4 in Shaffer classification); presence of glaucomatous optic disc changes, including high cup-disc ratios (CDRs were estimated by one experienced doctor (T.L.) under ophthalmoscopy), disc ratio asymmetry between eyes (>0.2), vertical elongation of the cup, focal neuroretinal rim thinning or notching, beta-zone peripapillary atrophy; abnormal circum papillary retinal nerve fiber layer (RNFL) thickness (cpRNFLT) in at least one clockwise optical coherence tomography (OCT) scan sector between 6, 7, 8, 10, 11, and 12 o’clock (6, 5, 4, 2, 1, and 12 o’clock in the left eye), confirmed in at least three examinations; visual field abnormalities, including localized defects respecting the horizontal meridian, nasal step, arcuate scotoma; and abnormal standard automated perimetry (SAP) tests were defined as pattern standard deviation (PSD) outside 95% normal confidence limits, or glaucoma hemifield test (GHT) outside normal limits (Humphrey, Zeiss, Germany), confirmed in at least two examinations. Exclusion criteria were the following: finding of secondary glaucoma (exfoliation syndrome, etc.) and any other ocular or systemic disease that could affect the optic nerve or the visual field.
The 241 unrelated control subjects ranged in age from 38 to 78 years, average age 63.7, and were confirmed without glaucoma symptoms. The clinical details of the 235 patients and the 241 control subjects are summarized in Table 1 (for more information, also see Appendix 1 and Appendix 2).
Twenty-six candidate genes were chosen for mutation screening in the patients with POAG. The candidate genes are listed in Appendix 3, and previous studies are summarized in Appendix 3. A total of 684 MIP oligonucleotides were designed to cover 314 exons (Appendix 4). The target DNA size was about 54 K bases. The MIPs were manually designed to capture the coding regions and 20- to 50-bp flanking regions of the exons of the target genes. The general structure for the MIP is a common 41-bp linker flanked by an extension and a ligation arm of 16 to 28 bp. The extension and ligation arms were designed with Tm of 57–62 °C. The MIP targets a specific 120–140 bp genomic region for gap-fill and circularization.
Individual MIPs were column synthesized (SBS Genetech, Beijing, China). MIP capture experiments were performed with some modifications as previously described . In briefly, a total of 684 probes were pooled and phosphorylations were performed with ATP and Polynucleotide Kinase (NEB, Beijing, China). Pooled probes used as a master probe mix. The genomic DNA was sheared into 300-500 bp by Bandelin sonorex AK 102 sonicator (Bandelin, Berlin, Germany) and then MIPs were added to hybridization mixture (300 ng fragmented DNA, 1× Ampligase buffer). Hybridization procedure was conducted as fellows: 94 °C for 2 min, 54 °C for 36 h. And then gap filling-in reaction was conducted as fellows: 0.3 μl Phusion DNA polymerse (NEB), 0.6 μl Ampligase (Epicentre, Madison) and 0.1 μM dNTP (NEB) was added for filling-in the gap at 54 °C for 3 h. To degrade un-circularized DNA, 0.5 μl Exonuclease I (NEB) and 0.5 μl Exonuclease III (NEB) was added to reaction mixture at 37 °C for 45 min, and then exonucleases were inactivated at 94 °C for 4 min.
Five microliters of circularized DNA was used for amplification in a 25-μl reaction containing 0.1 μM of the truncated PE1.0 primer and the PU primer with the Illumina adaptor sequence, 0.6 U of Phusion DNA polymerase, 200 μM of each dNTP, and 1X Phusion Master Mix. The first round of PCR was performed at 98 °C for 30 s, 13 cycles of 98 °C for 10 s, 57 °C for 30 s, 72 °C for 15 s, and finally 72 °C for 5 min. Then 5-μl reaction mixture containing 0.5 μM Illumina PE1.0 primer and sample index primer was added to the first-round PCR mixture to perform an additional PCR at 98 °C for 30 s, 13 cycles of 98 °C for 10 s, 59 °C for 30 s, 72 °C for 15 s, and finally, 72 °C for 5 min (primer sequences are listed in Appendix 4). Barcoded libraries were pooled together and purified with 0.8X AMPure XP beads (Beckman Coulter, Brea, CA) to remove the primer dimers. The purified library was sent for sequencing of 2X 150-bp paired-end reads on Illumina Hiseq X-Ten (San Diego, CA).
Before alignment, low-quality reads were removed with filtering. All reads were aligned to the human reference genome (GRCh37/hg19) using the Burrow Wheeler Aligner (BWA)-Smith Waterman (SW) algorithm from BWA software version 0.7.15, with all alignment parameters set to default values. Resulting alignments were processed and converted to the BAM format with SAMtools software version 0.1.18.22. The sequence variants were called by the GATK and Picard programs (https://software.broadinstitute.org/gatk/) and further annotated by wAnnovar and Ensembl Variant Effect Predictor (VEP). Sequence alignments at variant positions were manually inspected using Integrative Genomics Viewer (IGV) software version 2.3.90. An average of about 0.2 Gb per sample (range 0.1–0.4 Gb per sample) of high-quality sequences was obtained. In this study, several optimizations were employed to improve the efficiency and uniformity of capture of the MIPs (Appendix 5). The approach achieved a high on-target specificity that about 91% of total reads can be mapped to the target regions. Targeted sequencing yielded median about 1,000-fold coverage; 98.4% of bases had >50-fold coverage. Twelve variants were selected for confirmation, and all showed agreement with the results with Sanger sequencing (Appendix 5).
We selected rare variants for further analysis that met the following criteria: i) The variant was present in sequence reads from both strands, and the sequencing depth at the variant was more than 30; ii) the variant had a minor allele frequency (MAF) <0.01 in the 1000 Genomes Project database (variants with minor allele frequency <0.03 in the East Asian populations of Exome Aggregation Consortium (ExAC_EAS) were filtered out); iii) the variant represents at least 20% of the sequence reads at a particular site; and vi) all synonymous variants were not included.
In order to identify the PDVs from the identified rare variants (Appendix 6 and Appendix 7), we adopted the analysis procedure used in the previous literature to filter variants [23,24]. The variants that occurred in the control group and did not reach a statistically significant difference (p>0.05) between the patient and control groups were excluded. Variants, such as nonsense, frame-shift, canonical ± one or two splice sites, stop codon loss, deletion, and initiation codon loss, were included . Two in silico tools, PolyPhen2  and Sorting Intolerant From Tolerant (SIFT) , which have been recommended for aiding in the interpretation of sequence variants , were used to predict potentially deleterious effects of the variants. We filtered out the missense variants that were considered to be “tolerated” or “deleterious with low confidence” by SIFT, and were considered to be “benign” or “possibly damaging” by PolyPhen2 HVAR. Missense variants predicted to be deleterious by either SIFT or PolyPhen2 HVAR were included. Two in-frame deletion mutations predicted to be neutral by SIFT-Indel were also excluded. Five variants also occurred in the Han Chinese population in the 1000 Genomes Project (including 103 individuals of Han Chinese in Beijing (HCB) and 105 South Han Chinese (SHC), Appendix 8) and did not reach statistically significant difference were also excluded. Visualization of protein lollipop structures was performed by using MutationMapper.
The frequencies of PDV carriers between patients with POAG and controls in each gene were compared with Fisher’s exact test. The clinical features of the study subjects including age at diagnosis (years), IOP, cup-disc ratio, mean RNFLT, and mean defect between the group with PDV and without PDV in the patients and controls were statistically analyzed using the Welch two-sample t test, all p values were based on two-tailed statistically analyses, and p values less than 0.05 were considered statistically significant. All analyses were performed with R version 3.5.1.
In this study, 235 patients with POAG and 241 controls were included for variant analysis. The demographic and clinical characteristics of the participants are shown in Table 1. Overall, the mean age of the control group was older than that of the patient group, considering the possible association between the onset of POAG and the age of the individuals. Control individuals were carefully checked to exclude those with any symptoms of POAG. We selected 26 candidate genes for identification of PDVs in POAG based on previous studies (Appendix 3). These candidate genes were chosen for target sequencing analysis as they were reported to be involved in the etiology of eye diseases or were reported to be associated with POAG by GWASs. We analyzed the coding regions for these candidate genes by using MIP-based target sequencing. The workflow of the MIP-based assay is shown in Figure 1.
Two filters were mainly used for identifying the PDVs: allele frequency and in silico tools’ predictions of variants. We found 180 rare variants in 143 individuals among 235 patients with POAG and 57 rare variants in 94 individuals among 241 control subjects in coding regions of 26 genes (MAF<0.01, Appendix 6 and Appendix 7). All rare variants identified are heterozygous. Twenty-two rare variants were present in patients with POAG and the controls. However, no statistically significant differences for these 22 rare variants in the allele frequencies were found between the patient and control groups (p>0.11, Fisher’s exact test). After further filtering (see Methods), we identified 82 PDVs and 18 PDVs in the patients with POAG and the controls, respectively (Table 2, detailed information was shown in Appendix 9 and Appendix 10). Of the 235 patients with POAG evaluated, 66 (28.1%) had at least one PDV in the candidate genes. In comparison, 19 of 241 subjects harbored PDVs in the control group, indicating an enrichment of PDVs in the POAG cohort (28.1% versus 7.9%, p = 8.629e-09, Fisher’s exact test; Appendix 9).
Of the 82 PDVs, there were five known deleterious missense mutations that were reported in previous studies, including p.Gln337Arg , p.Gln368Ter , and p.Pro370Leu in MYOC , p.Leu107Val in CYP1B1 , and p.Arg130Trp in PITX2 . Thus, most of the PDVs identified in this study are novel. Insertions and deletions in the gene-coding regions, stop gain or loss, and variants located at the essential splice site sequences were generally considered to be more likely pathogenic . We found eight such mutations in the POAG cohort, including three truncating mutations (p.Arg272Ter and p.Gln368Ter in MYOC, and p.Tyr848Ter in FNDC3B), four frame-shift mutations (p.(Asp677fs) in ABCA1, p.(Ser267fs) in APOE, p.(Gly77fs) in CDKN2B and p.(Gly453fs) in FOXC1), and a stop-loss mutation p.Ter494Glu in EFEMP1 in the POAG cohort. Only one control subject harbored such a deleterious mutation (ABCA1 p.Asp1894Efs) (3.4% versus 0.4%, p = 0.019, Fisher’s exact test). Of the 82 PDVs, 74 PDVs were missense PDVs. All PDVs identified in the POAG cohort have low frequencies in the population (<0.3% in ExAC_EAS and 1000 Genome Project database). Four of 82 PDVs were found in two or more patients (ATXN2 p.Arg939Gln, three cases; TXNRD2 p.Gly290Val, three cases; EFEMP1 p.Leu451Pro, two cases; PAK7 p.Gly310Glu, two cases). However, there was no statistically significant difference in the frequency for each recurrent PDV between the patient group and the control group (p>0.12, Fisher's exact test).
PDVs were identified in 21 genes among 26 candidate genes for the POAG cohort, including ATXN2 (six PDVs [7.3% of total PDVs]), ABCA1 (six [7.3%]), FOXC1 (six [7.3%]), LOXL1 (six [7.3%]), MYOC (six [7.3%]), TXNRD2 (six [7.3%]), CDKN2B (five [6.1%]), GAS7 (five [6.1%]), APOE (four [4.9%]), EFEMP1 (four [4.9%]), FNDC3B (four [4.9%]), and PITX2 (four [4.9%]; Appendix 10). Of the 26 genes analyzed, five genes (ATOH7, SIX6, VRK2, CAV1, and CAV2) had no PDVs in the POAG cohort. Among the 26 candidate genes, the prevalence rate of PDVs in five genes showed a statistically significant difference between the patients and controls, including ATXN2 (n = 8, 3.4%, p = 0.0033), TXNRD2 (n = 8, 3.4%, p = 0.0190), MYOC (n = 6, 2.6%, p = 0.0140), FOXC1 (n = 6, 2.6%, p = 0.0140), and CDKN2B (n = 5, 2.1%, p = 0.0287). Thirty-three patients harbored PDVs in these five genes, accounting for 50% of the total patients harboring PDVs. In comparison, only one subject with one PDV in these five genes was found in the control group (33/235 versus 1/241, p = 4.533e-10, Fisher’s exact test).
Some patients (17 of 66 patients) harbored more than one PDV. For example, five patients with FOXC1 PDVs also had an additional PDV in ABCA1, ABO, GAS7, and EFEMP1 or two additional PDVs in AFAP1 and LOXL1 (Appendix 1). One patient had two different PDVs in ATXN2 (p.R939Q and p.S1186I). But whether the two PDVs derived from different alleles is unclear. We analyzed the difference in diagnosis age, IOP, cup-to-disc ratio, mean defect, and mean RNFLT between patients with one PDV and patients with additional PDVs. However, no difference was found in these clinical features.
MYOC is a well-established POAG-causing gene, with a high prevalence rate in patients with POAG, as reported in a previous study . In this study, we found six patients harboring six MYOC PDVs, accounting for up to 2.6% of all cases, which is slightly lower than previously reported (2.6% versus 4.6%) . Before filtering against the variants observed in the controls, there were 31 patients harboring rare variants in MYOC (an MAF<0.01 in the 1000 Genome Project database), representing the highest rate in the POAG cohort. However, several rare variants also occurred in control subjects. For example, p.Gln19His in MYOC has been reported in a previous study , which was found in three patients in this POAG cohort and two controls. Furthermore, p.Gln19His was found in three individuals among 208 Chinese individuals in the 1000 Genome Project population (see Appendix 8). These data indicated that p.Gln19His could be considered a polymorphism in the Chinese population. Homozygous p.Arg46Ter in MYOC was reported to be pathogenic . Yoon et al. found a homozygous mutation p.Arg46Ter in a Korean patient with JOAG at age 15 years old . In that report, the homozygote with this mutation was severely affected while the father, mother, and a sister were heterozygous for the mutation, apparently without detectable symptoms. This indicated that biallelic inactivation would result in a disease phenotype. In the present study, p.Arg46Ter of MYOC occurred in seven patients with POAG and in two control subjects (p = 0.10, Fisher’s exact test). These results supported that the heterozygous p.Arg46Ter is a benign variant. Thus, only six PDVs in MYOC were identified after filtering. Among these six PDVs, a truncating mutation (p.Arg272Ter), has not been reported in a previous report. We did not find p.Arg272Ter in the EXAC_EAS or 1000 Genomes Project population. We speculated that p.Arg272Ter may be a disease-causing mutation.
In the WDR36 gene, three PDVs (p.Arg95Gly, p.Ala147Thr, p.Pro780Leu; 1.3%) were found in patients. We found that two WDR36 rare variants, p.Leu240Val and p.Ile713Val, occurred in patients and control subjects (two subjects for p.Leu240Val and three subjects for p.Ile713Val in the patient group; four subjects for p.Leu240Val and six subjects for p.Ile713Val in the control group). Thus, these two variants may be benign. Only two patients with POAG harbored OPTN PDVs (p.E230G and p.Val498Asp). This low prevalence rate is comparable with that in other studies on OPTN .
Investigating whether PDVs occur in the active sites of proteins may be useful to evaluate the influence of alternations on protein function. Locations of PDVs and domains in proteins are shown by lollipop structures (Figure 2). We found several PDVs occurred in the active site of the encoded protein. For MYOC, previous studies have demonstrated that most mutations are located in the third exon, which codes for the olfactomedin (OLF)-like domain. The OLF domain contains a high-affinity calcium binding site with a five-bladed β-propeller structure . In this study, we found that all six PDVs identified in the POAG group occurred in the OLF domain, and most were located within or near the calcium binding site. FOXC1 is a DNA-binding transcriptional factor that is involved in eye development . We found a patient harboring p.Tyr115His in FOXC1, which is located in DNA binding sites of FOXC1 (Figure 2). We deduced that p.Tyr115His may have a strong effect on DNA binding of the transcription factor, which would influence the expression of downstream target genes of FOXC1. We found a PDV (p.Cys448Phe) occurred in the LOXL1 protein lysyl oxidase domain. Cys448 may be a conserved residue because the 443–456 sites are conserved in this domain, and several copper ion binding sites (His449, His451, and H453) are located in this region. Furthermore, Cys448 and Cys497 will form a disulfide bond. The mutation p.Cys448Phe may break the S–S bond. Thus, these variants have a higher probability of damaging the functional impact. The discovery of these PDVs in active sites also provided interesting targets for engineering relevant animal models of glaucoma.
To investigate whether the PDVs influence the symptom of POAG, we analyzed the association between PDVs and clinical features of patients with POAG. There were 66 individuals with PDVs and 169 individuals without PDVs in the patient group (Table 1). The differences in the clinical features between subjects with PDVs and without PDVs were evaluated with a t test. No statistically significant difference in the IOP, cup-to-disc ratio, and mean RNFLT were observed between the two groups. We found an association between the presence of PDVs and age at diagnosis (mean of patients with PDVs = 44.8 years, mean of patients without PDVs = 55.8 years, p = 5.37e-07, Welch two-sample t test). When classifying patients by age, we found that PDVs were enriched in the subgroup of patients younger than 40 years of age (p = 1.26e-05, Fisher’s exact test). It suggests probably a fraction of the PDVs contribute to the development of POAG disease and result in an earlier age of onset for glaucoma. A slight difference in the mean defect was also found (p = 0.1349, Welch two-sample t test), indicating that the symptom of the group without PDVs is slightly more severe than that of the group with PDVs. This might be explained by the fact that the subgroup without PDVs is older.
Glaucoma is a progressive degenerative condition, and the underlying mechanisms of glaucoma are not well understood. It has been suggested that glaucoma is a complex, multifactorial disease affected by genetic factors and environment [6,36]. Discovery of the potentially disease-relevant variants will facilitate the research of POAG etiology. To discover PDVs in coding regions in a POAG cohort, we developed an MIP-based target sequencing for multiplex genes. This parallel screening for multiple target genes could provide a comparison in their prevalence rate. We found several genes with high prevalence rates that have not been reported in previous studies. Among 26 genes, TXNRD2 (n = 8, 3.4%) and ATXN2 (n = 8, 3.4%) had the highest frequency in the POAG cohort, followed by MYOC (n = 6, 2.6%), FOXC1 (n = 6, 2.6%), LOXL1 (n = 6, 2.6%), ABCA1 (n = 6, 2.6%), EFEMP1 (n = 5, 2.1%), CDKN2B (n = 5, 2.6%), and GAS7 (n = 5, 2.1%; Table 2). In particular, we found that five genes, including TXNRD2, ATXN2, FOXC1, MYOC, and CDKN2B, reached a statistically significant difference between the patient group and the control group. We found that a substantial fraction of PDVs occurred in those genes identified in GWASs, such as TXNRD2, ATXN2, FOXC1, and CDKN2B. The TXNRD2 gene encodes a mitochondrial protein that is involved in the control of reactive oxygen species levels and the regulation of mitochondrial redox homeostasis . Oxidative stress is known to cause retinal ganglion cell (RGC) dysfunction in glaucoma models and clinical glaucoma samples. TXNRD2 regulates the cellular redox environment by scavenging reactive oxygen species in mitochondria . The gene Ataxin 2 (ATXN2) encodes a membrane protein located in the endoplasmic reticulum (ER) and plasma membrane (PM). This protein is involved in endocytosis and modulates mTOR signals that modify ribosomal translation and mitochondrial function . FOXC1 is a DNA-binding transcriptional factor that plays a role in a broad range of cellular and developmental processes, such as eye, bones, cardiovascular, kidney, and skin development . Mutations in FOXC1 have been associated with Axenfeld-Rieger (AR) syndrome, a disorder characterized by anterior segment malformations in the eye and glaucoma . A GWAS was performed to identify new susceptibility loci of POAG, and three top SNPs were identified: rs35934224 [T] within TXNRD2; rs7137828 [T] within ATXN2, and rs2745572 [A] upstream of FOXC1 . This study identified new risk loci for POAG. However, whether these candidate genes harbor pathogenically relevant rare coding variants and account for the observed association require further investigation. The present data indicated a high prevalence of potentially pathogenic variants in TXNRD2 (3.4%), ATXN2 (3.4%), and FOXC1 (2.6%) in patients with POAG, providing direct evidence to support the three genes involved in POAG. Furthermore, we found a PDV p.Tyr115His is located in DNA binding sites of FOXC1 protein that would has a strong influence on protein function. Altogether, these results indicated that a fraction of PDVs may have functional impact on POAG risk loci.
This study indicated that EFEMP1 might be involved in POAG. Four PDVs of EFEMP1 were observed in six patients with POAG. Specifically, a stop-loss mutation p.Ter494Glu was found in two sisters with POAG (Figure 3). This stop-loss mutation results in the production of an abnormal protein that is 29 amino acids longer than expected, which may have an impact on its protein function. The siblings also harbored a rare variant p.Arg387Gln in EFEMP1. The siblings received a diagnosis of POAG at the age of 21 and 23 years old, respectively. Both exhibited severe visual-field loss in the right eye. Their father also had POAG and was diagnosed at 43 years old. Their mother has not shown any detectable symptoms. Further analysis showed that their father harbored p.Ter494Glu, and their mother harbored p.Arg387Gln (Figure 3). As their father has shown POAG symptoms, we deduced that p.Ter494Glu is the potentially causative mutation for this family. Their mother has p.Arg387Gln, and she does not show any POAG symptoms yet. The missense EFEMP1 p.Arg387Gln occurred in six patients and eight control subjects, suggesting it might be a normal SNP. The EFEMP1 gene was located within the GLC1H locus on chromosome 2p (2p15-p16). Although there have been a few efforts to confirm the link to GLC1H, it remains uncertain . It is expressed in the retina and RPE, involved in age-related macular degeneration (AMD) [42,43]. A GWAS identified copy number variation (CNV) at NPHP1 and EFEMP1 as potential candidates for association of inherited retinal degenerative diseases . A report located a genetic locus (GLC1H) for adult-onset POAG maps to the 2p15-p16 region with linkage analysis in an Afro-Caribbean (Jamaican) population . In a previous study, we located a region on chr2 (chr2: 46.4M-65.6M) that contributed to POAG with family-linkage analysis studies in Chinese [22,45], which is overlapped with the 2p15-p16 region. A recent study identified a novel missense variant (p.Arg140Trp) in exon 5 of the gene coding for EFEMP1 that cosegregated with POAG in an African-American family with exome sequencing . In the present POAG cohort, four EFEMP1 PDVs occurred in six patients with POAG across 235 patients with POAG, while only one subject with one PDV was found in the control group (2.6% versus 0.4%, p = 0.1184, Fisher’s exact test). No other potentially disease-relevant variant in the candidate genes was found in the siblings. The results of the family analysis suggested that EFEMP1 p.Ter494Glu is a potentially disease-causing mutation, although we cannot completely rule out the possibility that other unexamined sources of mutations contribute to POAG symptoms.
This study not only highlighted the discovery of the novel putative disease-relevant variants for POAG but also provided an effective approach for POAG research. In this study, an MIP-based panel sequencing was used to discover POAG disease-relevant variants. The panel enables automatic and low-cost detection for large POAG samples. In addition, new MIPs can be added to the panel to increase the screening regions. We developed a highly efficient protocol for a large number of samples. The cost of reagent and sequencing is about USD20 per sample for the POAG gene screening with this panel (see Appendix 11). Because MIPs can be used 10,000 times once the probes have been synthesized, the cost per sample would greatly decrease as the sample increases. This cost-saving approach has great potential for evaluating the role of each candidate gene by screening large samples and is suitable for use in a routine mutation screening workflow.
A limitation of the present study is the pathogenicity of the PDVs identified in this study mainly based on the prediction of in silico tools. Further verification with function studies, such as animal models, is required. In this study, in silico tools were used to aid in the interpretation of sequence variants. The score of the prediction algorithm is usually based on the alteration of polarity, charge, and evolution conservation of amino acids. The predicted results might provide useful information for estimating the influence of these variants on protein function. However, the prediction of these variations on protein function might be inconsistent with the real context. In general, most algorithms for missense variant prediction are 65–80% accurate when examining known disease variants . A further limitation of this study is that the sample size might not be large enough to detect all possible PDVs showing a statistically significant difference between patients and controls. Thus, these findings should be replicated in additional, larger cohorts. Therefore, additional cohort studies will be valuable for the identification of pathogenic alternations.
Appendix 1. Primary open angle glaucoma (POAG) samples information and the potentially disease-relevant variants (PDVs) identified in POAG patients.
Appendix 2. Control samples information and PDVs identified in control subjects.
Appendix 3. The list of 26 candidate genes for POAG in this study.
Appendix 4. Primers and molecular inversion probes (MIPs) sequences.
Appendix 5. Optimization of MIPs based enrichment and Sanger sequencing confirmation.
Appendix 6. Rare variants identified in POAG cohort.
Appendix 7. Rare variants identified in control subjects.
Appendix 8. Rare variants identified in 1000-genome project (CHB and CHS).
Appendix 9. PDVs in POAG and control with minor allele frequency (MAF) and prediction information.
Appendix 10. PDVs identified in POAG as compared with control.
Appendix 11. Cost estimation for panel-sequencing.
The work is supported by the project (No. 81670889) from the National Natural Science Foundation of China, the project (No. cstc2015jcyjBX0051) from the Natural Science Foundation of Chongqing, the project (Platform Enhancement of Radiation & Cancer Biology Laboratory) from Special funds for guiding local scientific and technological development by the central government of China, the project (Integrated innovation and application of key technologies for precise prevention and treatment of primary lung cancer) from Chongqing Municipal Health Committee, and the project (Technology platform construction of next generation sequencing and research on clinical translation) from Chongqing Cancer Institute. The funders only provided financial support and did not have any additional role in the study design, data collection and analysis, decision to publish, or manuscript preparation. Ethics approval and consent to participate: All patients gave their informed consent for their anonymized data to be submitted for audit and publication. The Ethics committee of the Third Affiliated Hospital (Daping Hospital) of the Army Medical University and the Research Institute of Surgery of the Academy of Military Medical Science (Medical research ethics approval No.49 (2017)). Dr. Tang (email@example.com) and Dr. Shi (firstname.lastname@example.org) are co-corresponding authors for this paper.