Figure 1 of Strom, Mol Vis 2013; 19:980-985.

Figure 1. Rare Variant Identification is Correlated With Coding Sequence Length. Coding sequence (CDS) length in nucleotides is plotted against the proportion of individuals carrying a very rare (minor allele frequency [MAF] < 0.1%) missense variant. A very strong positive correlation is observed (Pearson’s correlation coefficient r = 0.89), indicating that genes with long coding sequences are more likely to have a high rate of rare missense variants, independent of the functional impact of those variants.
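
For readers who want to see how a Pearson correlation coefficient like the one reported in the caption is computed, the following is a minimal sketch in Python. The function name `pearson_r` and the two input lists are hypothetical and chosen only for illustration; the numbers are toy values, not data from the figure or the paper, and this is not presented as the authors' analysis code.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (numerator of r)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator terms of r)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (NOT taken from the figure):
# CDS length in nucleotides, one entry per gene
cds_length_nt = [900, 1500, 3000, 4500, 6000]
# Proportion of individuals carrying a very rare missense variant in that gene
carrier_proportion = [0.002, 0.004, 0.007, 0.012, 0.013]

r = pearson_r(cds_length_nt, carrier_proportion)
print(f"r = {r:.2f}")
```

A value of r close to 1 on real data would correspond to the pattern the caption describes: the longer the coding sequence, the larger the share of individuals carrying a very rare missense variant in that gene.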