Molecular Vision 2024; 30:49-57
<http://www.molvis.org/molvis/v30/49>
Received 11 August 2023 | Accepted 17 February 2024 | Published 19 February 2024
Madhulatha Pantrangi,1,4 Julie Rath,1 Nicole Kaetterhenry,1 Kari Branham,2 Dana Talsness,1 James L. Weber1,3
1PreventionGenetics, part of Exact Sciences, Marshfield, WI; 2University of Michigan, Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, Ann Arbor, MI; 3University of Wisconsin, School of Medicine and Public Health, Department of Pediatrics, Madison, WI; 4Weill Cornell Medicine, Department of Pathology and Laboratory Medicine, New York, NY
Correspondence to: James Weber, 1001 N. Central Avenue, Suite 302D, Marshfield, WI 54449; Phone: (715) 316-8157; email: jlweber7@wisc.edu
RPGR pathogenic variants are the major cause of X-linked retinitis pigmentosa. Here, we report the results from 1,033 clinical DNA tests that included sequencing of RPGR. A total of 184 RPGR variants were identified: 78 pathogenic or likely pathogenic, 14 uncertain, and 92 likely benign or benign. Among the pathogenic and likely pathogenic variants, 23 were novel, and most were frameshift or nonsense mutations (87%) and enriched (67%) in RPGR exon 15 (ORF15). Identical pathogenic variants found in different families were largely on different haplotype backgrounds, indicating relatively frequent, recurrent RPGR mutations. None of the 16 mother/affected son pairs showed de novo mutations; all 16 mothers were heterozygous for the pathogenic variant. These last two observations support the occurrence of most RPGR mutations in the male germline.
Retinitis pigmentosa (RP) is the most common inherited retinal disorder. It is characterized by the progressive degeneration of the rod photoreceptors, which leads to early-onset night blindness that often starts in adolescence, and the subsequent degeneration of the cone photoreceptors, resulting in loss of central vision. RP exhibits autosomal dominant (15%–25% of cases), autosomal recessive (5%–20% of cases), X-linked (5%–15% of cases), and complex inheritance patterns. The overall incidence of RP is 1 in 3,000–7,000 [1].
X-linked RP is the most severe form of RP, with early onset and rapid progression in males [2–4]. X-linked RP is caused primarily by pathogenic variants in two genes: RP2 (10%–20% of cases) and RPGR (70%–90% of cases) [1–3,5,6]. Pathogenic variants in RPGR have also been reported to cause other non-syndromic retinal dystrophies such as cone dystrophy, cone-rod dystrophy, and macular dystrophy [7,8]. Females heterozygous for RPGR pathogenic variants display a wide variety of symptoms ranging from asymptomatic to severe disease. However, most females are mildly affected compared to males [9]. The retinitis pigmentosa GTPase regulator (RPGR) protein is located within connecting cilia of both rod and cone photoreceptors. The protein has been proposed to be involved in microtubule organization and/or transport regulation in the cilia [10].
RPGR has two major transcripts produced by alternative splicing [4,11,12]. A 19-exon transcript is the most abundant transcript in most tissues, while a 15-exon transcript is found in the retina. The 15-exon transcript is the only transcript known to be involved in X-linked RP. Over 650 pathogenic variants have been reported in the 15-exon transcript (HGMD) [2–4,6,13–18]. The majority (approximately 88%) of the reported pathogenic variants are loss-of-function variants, especially nonsense and frameshift variants.
The last exon in the 15-exon transcript (usually called ORF15) is relatively large, is extremely rich in purine in the coding strand, and contains the majority of the pathogenic RPGR variants. ORF15 is much longer than exon 15 in the 19-exon transcript, and it is difficult to sequence ORF15 using standard sequencing methods [3,19]. Three approaches to sequencing ORF15 have been reported: Sanger sequencing [3,20–22], long-range polymerase chain reaction (PCR) followed by short-read next-generation sequencing (NGS) of the PCR products [3,23], and NGS alone [17].
We present here results of clinical sequencing of RPGR in over 1,000 patients. We characterize many RPGR sequence variants and compare variant interpretations by PreventionGenetics with those from other clinical laboratories. We also show data on test yields, and present results which shed light on the multigenerational history of pathogenic variants in this gene.
Patient ancestry information was provided by the ordering physician for just over half (55%) of the 1,033 patients tested. Of the 572 patients for whom ancestry information was obtained, 69% had European or Middle Eastern ancestry, 14% had Asian ancestry, 5% had Hispanic ancestry, 4% had African ancestry, and 8% had mixed ancestry from two or more continents.
Patient specimens were purified DNA (32%), whole blood (56%), saliva (11%), or buccal swabs (1%). Purified DNA was extracted from the specimens in the laboratories of the ordering hospitals. At PreventionGenetics, DNA was extracted from blood, saliva, and buccal specimens using a Chemagic 360i automated DNA extraction instrument (Model 2024–0030, Revvity, Baesweiler, Germany). The concentrations of all the DNA stock solutions were determined by absorbance at 260 nm. The DNA stock solutions were diluted with water to a final concentration of 15 ng/µl for sequencing.
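The dilution step described above follows the usual C1 × V1 = C2 × V2 relationship. The following is a minimal sketch of that arithmetic, not the laboratory's actual procedure. The function names are hypothetical, and the conversion factor of 50 ng/µl per A260 unit for double-stranded DNA at a 1 cm path length is a standard convention that is assumed here rather than stated in the text.

```python
# Conventional conversion for double-stranded DNA at a 1 cm path length.
# This factor is an assumption; the article does not state it.
DSDNA_NG_PER_UL_PER_A260 = 50.0
TARGET_NG_PER_UL = 15.0  # working concentration given in the text


def stock_concentration(a260, dilution_factor=1.0):
    """Estimate the stock concentration (ng/ul) from absorbance at 260 nm."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def dilution_volumes(stock_ng_per_ul, final_volume_ul,
                     target_ng_per_ul=TARGET_NG_PER_UL):
    """Return (stock_ul, water_ul) obtained from C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical example: a stock measured at 60 ng/ul, 100 ul of working solution.
stock_ul, water_ul = dilution_volumes(60.0, 100.0)
```

In this hypothetical case, 25 µl of stock and 75 µl of water give 100 µl at 15 ng/µl.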
With the exception of the purine-rich region of ORF15 (about c.2300 to c.3400 in transcript NM_001034853.2), the full coding region of each RPGR exon plus at least 10 bases of noncoding flanking DNA was sequenced using standard NGS methods. For targeted testing, standard Sanger sequencing was used.
For Sanger sequencing, PCR was used to amplify the targeted regions. After purification of the PCR products, cycle sequencing was performed using the ABI Big Dye Terminator v.3.1 kit (catalog #4337457, ThermoFisher, Grand Island, NY). Products were resolved by electrophoresis on an Applied Biosystems (Grand Island, NY) 3730 xl capillary sequencer. Cycle sequencing was performed separately in both the forward and reverse directions.
For standard NGS, patient DNA from protein-coding regions of the genome was captured using an optimized set of DNA hybridization probes. Captured DNA was sequenced using Illumina’s Reversible Dye Terminator platform (Illumina, San Diego, CA). The average coverage was ≥100×, and the minimum coverage of all RPGR regions, except ORF15, was 20×. All uncertain, likely pathogenic, and pathogenic variant calls identified via NGS that did not pass the internal quality control metrics of PreventionGenetics were confirmed by Sanger sequencing. The Sanger sequencing-confirmed variants included all of the deletion and insertion variants. For all the NGS tests, structural variants were routinely called using Exome Depth software [24]. A single structural variant originally detected via analysis of the NGS data was confirmed using standard array comparative genomic hybridization.
For all tests involving full sequencing of RPGR, including gene panels, a special, redundant Sanger sequencing method for coverage of the ORF15 purine-rich region was used in addition to NGS. Our method is a modified version of that described by Bader et al. [22]. A set of overlapping PCR amplicons was prepared, as shown in Figure 1C. Each of these amplicons was separately Sanger sequenced bidirectionally from the ends to obtain a highly redundant sequence assembly. The resulting electrophoretic traces were manually reviewed by experienced PreventionGenetics laboratory staff.
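The idea of redundant coverage by overlapping amplicons can be illustrated with a small depth calculation. This is only a sketch: the amplicon coordinates below are hypothetical placeholders, not the amplicons shown in Figure 1C, and the target interval uses the purine-rich region coordinates given elsewhere in this article (c.2351 to c.3350).

```python
def coverage_depth(amplicons, start, end):
    """Per-position amplicon depth over the inclusive interval [start, end]."""
    depth = [0] * (end - start + 1)
    for a_start, a_end in amplicons:
        lo = max(a_start, start)
        hi = min(a_end, end)
        for pos in range(lo, hi + 1):
            depth[pos - start] += 1
    return depth


# Hypothetical amplicon coordinates (coding positions), for illustration only.
AMPLICONS = [(2200, 2800), (2600, 3200), (3000, 3500), (2300, 3400)]
REGION = (2351, 3350)  # purine-rich region as defined in the article

depth = coverage_depth(AMPLICONS, *REGION)
min_depth = min(depth)  # every position is covered by at least this many amplicons
```

With these placeholder coordinates every position in the region is spanned by at least two amplicons. Because each amplicon is sequenced in both directions, the number of independent reads per position would be correspondingly higher.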
DNA testing was performed in the accredited clinical laboratory of PreventionGenetics, part of Exact Sciences (Marshfield, WI), from January 2011 to May 2022. Tests were ordered by healthcare providers, who were nearly always physicians. Here, we report on the tests for vision loss in which the entire RPGR or a portion of this gene was sequenced. Variants were interpreted using the American College of Medical Genetics and Genomics/Association for Molecular Pathology guidelines [25]. The primary guideline criteria used to interpret the RPGR variants were PVS1 (null variants), PS4 (prevalence of variant in cases versus controls), PM1 (location in mutation hotspot), and PM2 (rarity of the variants within large unaffected control populations). The secondary criteria were PP1 (cosegregation in affected family members), PP3 (computational evidence), and PP4 (family history indicates X-linked inheritance). The variant c. and p. positions were determined using transcript NM_001034853.2. All the RPGR variants detected at PreventionGenetics have been submitted to ClinVar.
All pathogenic variants that were found in multiple, apparently unrelated, male patients were analyzed. Information that was supplied on the test order form, including the patient’s family name, name of the ordering physician, physician’s location, patient’s ethnicity, and patient’s notes, was used to exclude patients who were closely related from haplotype analysis. Sequencing data were then used to identify the genotypes of benign RPGR variants. A single difference in any of the benign variants was sufficient to conclude that the haplotypes were different.
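The comparison rule above (any single difference at a benign site makes two haplotypes different) amounts to an exact-match comparison across the same ordered set of sites. The following is a minimal sketch under the assumption that each male patient contributes one allele per site, since males are hemizygous for RPGR. The patient identifiers, sites, and alleles are invented for illustration.

```python
from collections import defaultdict


def group_by_haplotype(patients):
    """Group patient ids by haplotype.

    `patients` maps a patient id to a tuple of alleles observed at the same
    ordered list of benign variant sites. Two haplotypes are treated as the
    same only if every allele matches.
    """
    groups = defaultdict(list)
    for patient_id, haplotype in patients.items():
        groups[haplotype].append(patient_id)
    return dict(groups)


def count_distinct_haplotypes(patients):
    """Number of different haplotypes among the patients."""
    return len(group_by_haplotype(patients))


# Hypothetical genotypes at three benign sites for four male patients.
PATIENTS = {
    "P1": ("A", "G", "T"),
    "P2": ("A", "G", "T"),  # identical to P1 at every site
    "P3": ("A", "C", "T"),  # differs from P1 at the second site
    "P4": ("G", "G", "T"),  # differs from P1 at the first site
}

n_distinct = count_distinct_haplotypes(PATIENTS)
```

Here the four hypothetical patients carry three different haplotypes.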
As shown in Figure 1A, two primary RPGR transcripts have been identified: a 19-exon transcript (the most abundant transcript in most tissues) and a 15-exon transcript (expressed in the retina) [4,11,12]. Exon 15 (ORF15) in the 15-exon transcript is about 10 times the length of exon 15 in the 19-exon transcript and includes a region of 1,000 nucleotides (NM_001034853.2: c.2351 to c.3350) that is extremely rich in purine in the coding strand (95% purine) and that encodes mostly glycine or glutamic acid (82% Gly or Glu). Consistent with reports in the literature [3,19], our laboratory was unable to obtain satisfactory sequence data for the ORF15 purine-rich region using either standard Sanger or NGS sequencing methods. We found that standard NGS using hybridization capture (exome sequencing) resulted in almost no coverage of the purine-rich region, and even whole-genome sequencing with PCR-free library preparation resulted in only poor coverage of this region (Figure 1B). We therefore developed a special Sanger sequencing method to sequence the purine-rich region. This method involved the use of multiple, overlapping PCR amplicons, as shown in Figure 1C, which ensured full coverage of the purine-rich region and accurate, redundant sequencing of specimens from both male and female patients.
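The purine content quoted above is the fraction of A and G bases in the coding strand. A minimal sketch of that calculation follows. The toy sequence is invented for illustration and is not taken from RPGR. It is built from GGA, GAA, and GAG triplets (which encode glycine and glutamic acid) plus one terminal pyrimidine.

```python
def purine_fraction(seq):
    """Fraction of purines (A or G) among the unambiguous bases in `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AG" for b in bases) / len(bases)


# Invented 20-nucleotide example: 19 purines and one pyrimidine.
TOY_SEQUENCE = "GGAGGAAGAGGGGGAAGAAT"
toy_fraction = purine_fraction(TOY_SEQUENCE)
```

For this toy sequence the fraction is 19/20, or 95%, which happens to match the proportion reported for the purine-rich region.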
RPGR was sequenced using several different tests available at PreventionGenetics. These primarily included a comprehensive 363-gene inherited retinal dystrophy (IRD) panel (Test #4379), an 82-gene RP panel (Test #2699), a 35-gene CRD panel (Test #10403), a 28-gene Stargardt disease panel (Test #4315), a four-gene X-linked RP panel (Test #10187), and sequencing of the full RPGR alone (Test #11013; Table 1). These are all NGS tests that use an exome backbone. Detailed descriptions of these tests are available on the PreventionGenetics website (search the site using the test numbers). For all of these tests, special Sanger sequencing of ORF15 was performed in addition to standard NGS of other exons/genes. The ordering physicians were allowed to select the test they preferred. Targeted Sanger sequencing was also frequently performed for specific RPGR variants. These targeted variant tests were typically ordered for family members of probands and are not shown in Table 1.
From January 2011 to May 2022, 1,033 vision tests with RPGR sequencing were performed. Excluding those who underwent targeted testing, 469 males and 308 females were tested. The patients whose specimens were subjected to full RPGR sequencing and who had pathogenic or likely pathogenic RPGR variants ranged in age from four to 73 years at the time of testing, with a median age of 32 years for males (65 patients) and 26 years for females (17 patients).
The proportion of the conducted tests in which pathogenic or likely pathogenic RPGR variants were found (RPGR test yield) is shown for each of the major tests in Table 1. The RPGR test yields for the IRD, CRD, and Stargardt panels were low (<3%). The RPGR single-gene sequencing and X-linked RP panel were found to have the highest test yields (39% and 58%, respectively). Apparently, the physicians who ordered the RPGR single-gene and X-linked RP panel tests suspected X-linked inheritance, presumably due to the patient’s family history. Not surprisingly, the ratio of RPGR test yield to overall test yield was highest for the RPGR single-gene and X-linked RP panel tests. About 16% of the positive RP panel tests were positive in RPGR, confirming that RPGR is one of the most frequently mutated genes in RP.
Seventeen female patients were found to be heterozygous for a pathogenic or likely pathogenic RPGR variant. In 13 of these cases, the specimens were subjected to either the X-linked RP panel test or sequencing of RPGR alone. In three of these cases, the specimens were subjected to the RP panel test, and in one case, the specimen was subjected to the IRD panel test. The physicians who ordered these tests provided highly variable clinical information: no clinical information was provided for four patients, a family history of X-linked RP was indicated for seven patients, and vision loss was indicated for six patients (“tapetal-like bilateral reflex,” “early signs of RP at 40 years,” “mild pigmentary changes,” “RP,” “personal history of vision issues,” and “diffuse chorioretinal atrophy, bony spicules, vascular attenuation, cataracts, macular hole [right eye], and decreased night and peripheral vision”). Specimens from three of the six patients with indicated vision loss were subjected to the RP or IRD panel test.
The RPGR variants interpreted at PreventionGenetics as being likely benign or benign (92 total) are listed in Appendix 1. About half (51%) of these variants were located in ORF15, and 53% of the benign and likely benign variants located in ORF15 were short in-frame deletions or insertions, compared to only 4% of those located in other exons. This indicates that there is a propensity for deletion and insertion mutations within the purine-rich ORF15.
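Whether a short coding deletion or insertion is in-frame depends on whether its net length change is a multiple of three. The sketch below shows that rule only. It is a simplification that ignores splice-site effects and new stop codons created at the junction, and the function name is hypothetical.

```python
def indel_frame_effect(ref_len, alt_len):
    """Classify a coding indel by its net length change in nucleotides."""
    net_change = alt_len - ref_len
    if net_change == 0:
        raise ValueError("variant does not change the sequence length")
    return "in-frame" if net_change % 3 == 0 else "frameshift"


# A 2-nucleotide deletion shifts the reading frame.
two_nt_deletion = indel_frame_effect(ref_len=2, alt_len=0)
# Deletions of 3 or 120 nucleotides preserve the reading frame.
three_nt_deletion = indel_frame_effect(ref_len=3, alt_len=0)
long_deletion = indel_frame_effect(ref_len=120, alt_len=0)
```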
The interpretations provided by other laboratories and deposited in ClinVar for the likely benign or benign variants are shown in Appendix 1. Among the 66 variants in ClinVar with interpretations provided by other laboratories, all but one (98%) have been interpreted as likely benign or benign by at least one other laboratory. The exception was variant c.1245+6A>G. We interpreted this variant as likely benign on the basis of the lack of a predicted effect on splicing and the presence of two male hemizygotes with this variant in the gnomAD database. The only other laboratory that deposited an interpretation in ClinVar for this variant classified it as uncertain.
Fourteen RPGR variants were interpreted by PreventionGenetics as having uncertain clinical significance, and these are shown in Appendix 2. These uncertain variants were all either rare missense variants or rare in-frame deletions or insertions. The uncertain in-frame deletions or insertions ranged in length from 3 to 120 nucleotides, and all were located in ORF15. Interpretations by other laboratories were available in ClinVar for eight of the 14 uncertain variants. For seven of the eight, at least one other laboratory had also classified the variant as uncertain. The exception was variant c.1443A>C (p.Glu481Asp), for which an interpretation of likely benign was deposited by one other laboratory.
The RPGR variants interpreted by PreventionGenetics as being likely pathogenic or pathogenic are also listed in Appendix 2. The likely pathogenic and pathogenic variants were enriched in ORF15 (68%) compared to exons 1–14 (32%). Nearly all (94%) of the pathogenic or likely pathogenic variants were loss-of-function (i.e., nonsense, frameshift, or splicing) variants. The remaining 6% (five variants) were missense variants. Within the frameshift variants, the ratio of deletions to insertions was 6:1 for the entire RPGR. This bias toward deletions was particularly pronounced in ORF15, where the ratio of deletion to insertion frameshifts was 9:1. In ClinVar, variant interpretations had been deposited by other laboratories for 44 of the 78 variants interpreted as likely pathogenic or pathogenic at PreventionGenetics. For these 44 variants, at least one other laboratory had interpreted the variant as either likely pathogenic or pathogenic.
The distributions by variant type of the ORF15 variants identified at PreventionGenetics (Appendix 2) and of those in the HGMD are shown in Figure 2. The distributions were found to be similar; however, the proportion of variants that contained small deletions or insertions was higher in the PreventionGenetics data than in the HGMD data. This may be due to the special Sanger sequencing method we used to cover the extremely purine-rich portion of ORF15. It is possible that other laboratories did not detect some of the short deletion or insertion variants because of poor sequence quality or low sequence coverage of this region.
Although we routinely screened RPGR for structural variants, and structural variants are relatively easy to detect in male hemizygous patients, we found only one. This complex structural variant included a 2.3 kb deletion that encompassed exons 2 and 3 in RPGR, along with an approximately 400 bp inversion of sequences adjacent to the deleted region. Notably, Tuupanen et al. [17] reported a similar rate of occurrence of RPGR structural variants. In that study, all the structural variants involved deletions, and there was one relatively small deletion that covered exons 2 and 3.
Haplotypes were generated for 29 male patients who shared an identical pathogenic variant with at least one other patient. In total, there were eight shared pathogenic variants; the number of patients with a particular pathogenic variant ranged from two to nine. The most commonly shared variants were c.2236_2237del (nine patients), c.2405_2406del (four patients), and c.3096_3097del (four patients). These three variants were among the most common reported in other recent studies [4,17,18].
Out of the 29 possible different haplotypes, we detected 21 different haplotypes (72%). Differences in the patients’ ancestries could conceivably account for the large number of different haplotypes. We did not receive ancestry information for all 29 patients; however, among the 16 whose ancestry was listed as Caucasian or European, 12 different haplotypes were found (75%).
For many of the RPGR-positive probands, targeted testing for the pathogenic or likely pathogenic variants was performed using specimens from family members. This resulted in the identification of 16 mother/affected son pairs. Maternity was confirmed in each of these 16 pairs using the PreventionGenetics identity panel of 13 multiallelic microsatellites. Among these 16 mother/affected son pairs, we detected zero de novo variants. All the mothers in these pairs were heterozygous for the pathogenic variants.
Consistent with the reports of others [3,19], we were unable to clearly sequence the purine-rich portion of ORF15 using standard sequencing methods (Figure 1B). Several groups have described using either Sanger sequencing [2,6,14,15,19,21,22] or long-range PCR followed by shearing and NGS to sequence this region [3,23]. We used special Sanger sequencing with multiple, overlapping amplicons (Figure 1C). Only one group has reported the successful use of standard NGS alone to sequence this purine-rich region [17]; however, the report did not provide enough detail to show how this was accomplished.
We sequenced the complete coding regions of RPGR in several different panel tests (Table 1). Unsurprisingly, the highest test yield for RPGR pathogenic variants was achieved using the X-linked RP panel and by sequencing RPGR alone. Nearly all of the specimens that returned a positive X-linked RP test result had pathogenic RPGR variants present rather than other gene variants (Table 1), confirming that RPGR is by far the most commonly mutated RP gene on the X chromosome [2,6,14,15,20]. All the specimens that returned a positive X-linked RP test result that did not have pathogenic RPGR variants had pathogenic RP2 variants.
A limitation of the clinical testing process that we used is that we often do not receive full clinical information, and sometimes no clinical information at all, for the ordered tests. We therefore cannot draw firm conclusions about phenotype/genotype relationships. Nevertheless, of the 17 females whose full RPGR sequencing yielded pathogenic or likely pathogenic variants, several clearly had a degree of vision loss. This is consistent with previous reports [9,14–17]. In contrast to many other X-linked disorders, female heterozygotes with pathogenic RPGR variants are sometimes significantly affected.
Including the one structural variant identified, we found 92 uncertain, likely pathogenic, or pathogenic RPGR variants in this study, 31 of which are novel. Thus, we have significantly expanded the spectrum of known clinically relevant RPGR variants. The spectrum of detected causative RPGR variants is consistent with that of previous reports (HGMD). About two-thirds of the pathogenic variants were located in ORF15, and loss-of-function variants—especially frameshift variants—predominated. The purine-rich ORF15 is especially prone to short deletion and insertion mutations, with deletions being the predominant type. Pathogenic large deletions or insertions involving multiple RPGR exons are relatively rare. We found only one in all the patients tested.
The variant interpretations provided by our laboratory showed excellent agreement with those of other clinical laboratories deposited in ClinVar (Appendix 1 and Appendix 2). In particular, there was considerable agreement when the categories of likely benign and benign were combined and the categories of likely pathogenic and pathogenic were combined. Nevertheless, variant interpretation involves a degree of subjectivity, and some of the differences in interpretation among laboratories are also likely due to the appearance of new information about variants since interpretations were deposited in ClinVar.
We determined that most of the unrelated patients who shared identical RPGR pathogenic variants had different haplotypes. Our estimate of the proportion of patients with different haplotypes (72%) is probably an underestimate because we did not have pedigree information for most of these patients, and it is likely that some of the patients with shared haplotypes are closely related (first- or second-degree relatives). In addition, our haplotypes only encompassed variants within RPGR, which covered a maximum of 38 kb. If we had been able to extend the haplotypes over a longer chromosomal segment, it is quite likely that some of the apparently identical haplotypes would have been found to be different.
Our haplotype results, which indicate recurrent RPGR mutations, and the absence of de novo variants in the mother/affected son pairs do not prove, but are consistent with, the following scenario. New, causative RPGR mutations occur primarily in the male germline and are passed on to daughters. These daughters and other heterozygous females pass the pathogenic variants to heterozygous daughters and affected sons. The affected sons, at least in the past, reproduced at rates substantially below those of unaffected men (and women). Therefore, despite the transmission from heterozygous mothers to daughters, the causative variants usually disappeared from families within a few generations. As a result, affected males with identical pathogenic variants mostly have different haplotype backgrounds.
This scenario is supported by the presence of common pathogenic RPGR variants in patients with widely differing ancestry. For example, in a study of Chinese visually impaired patients, the most common pathogenic RPGR variants were c.2236_2237del and c.2405_2406del [18]—the two most common pathogenic variants identified in this study and also the two most common pathogenic variants listed in ClinVar. For many other monogenic disorders, the most common pathogenic variants differ widely among world populations. The c.2236_2237del and c.2405_2406del RPGR variants found around the world are therefore likely the result of recurrent mutation.
This scenario is also supported by data from RPGR pedigrees. In nine large, multigeneration pedigrees reported by Chen et al. [26], only 15 of 53 (28%) affected males of reproductive age had children, whereas 69 of 84 (82%) females of reproductive age who were heterozygous for pathogenic variants had children. The same pedigrees showed that the affected males who reproduced had an average of 1.7 children, while the heterozygous females who reproduced had an average of 3.1 children. Further study of de novo RPGR mutations and additional pedigree analyses are necessary to confirm this scenario.
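The pedigree figures above can be combined into a rough per-person estimate by multiplying the fraction who had children by the average number of children among those who did. This derived quantity is not reported in the article or in the cited pedigree study. It is a back-of-the-envelope illustration that treats the two published summaries as directly combinable.

```python
def mean_children_per_person(n_with_children, n_total, mean_children_if_any):
    """Average number of children per person of reproductive age."""
    return (n_with_children / n_total) * mean_children_if_any


# Counts and averages as quoted in the text from the nine pedigrees [26].
affected_males = mean_children_per_person(15, 53, 1.7)
heterozygous_females = mean_children_per_person(69, 84, 3.1)

# Relative reproductive output of affected males versus heterozygous females.
relative_output = affected_males / heterozygous_females

print(f"affected males: {affected_males:.2f} children per person")
print(f"heterozygous females: {heterozygous_females:.2f} children per person")
print(f"ratio: {relative_output:.2f}")
```

Under these assumptions, affected males averaged about 0.48 children per person compared with about 2.55 for heterozygous females, a ratio of roughly 0.19.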
Given that RPGR is among the most commonly mutated genes in all RP patients [27,28], there is an urgent need to optimize the clinical sequencing of this gene. In addition, understanding the genetic basis of RP is vital to the development of RPGR-based gene therapy treatments [29–31].
Appendix 1. Benign and Likely Benign RPGR Variants Detected at PreventionGenetics.
We thank Keith Nykamp for assistance in development of the special Sanger sequencing method, Heidi Lowery for assistance with artwork, and Courtney Haessly, Hannah Resheske, Alicia Gardner, Steven Lenz, and Jennifer Yeh for manual analysis of ORF15 sequencing data.