Rare genetic diseases were one of the first beneficiaries of the high-throughput sequencing revolution. Gene discovery in rare disease was once a laborious endeavor that required gene mapping approaches — linkage analysis in dominant pedigrees, or homozygosity mapping in consanguineous families — to narrow the search space, followed by targeted sequencing of coding regions of the genes that remained. Alternatively, many genes were discovered using hypothesis-driven approaches, i.e. using our knowledge of the disease and its known pathways to identify candidate genes for sequencing.
All of that changed about ten years ago with high-throughput “exome” sequencing, which selectively targeted the exons of all protein-coding genes in a single assay. In 2012, Canada launched the Care4Rare program to capitalize on the opportunity for rare disease patients. Kym Boycott and colleagues have just published a ten-year perspective of Care4Rare Canada in the American Journal of Human Genetics. It’s a fascinating read. The successes, challenges, and lessons learned offer some useful insights as we enter the next phase of gene discovery in rare disease.
The authors break down their decade-long experience into three project periods (called FORGE, Care4Rare, and SOLVE) each with a distinct approach to patient selection and gene discovery. In the early phases, they prioritized clinically recognizable syndromes for which the gene was not yet known, as well as families with significant histories: the same disease segregating across generations or in multiple consanguineous individuals (FORGE). They transitioned to studies of unsolved disorders in patients who had undergone limited genetic testing (Care4Rare), and eventually transitioned to the most difficult families: those with a likely genetic disorder but negative genome-wide testing (SOLVE). The outcomes reflect both the selection strategy and the state of the field in each era:
Some observations from these outcomes, most of which are highlighted by the authors:
- Diagnostic rates were high among families with limited genetic testing (eras 1-2), and almost a third of them (30-32%) had diagnoses in known disease genes.
- Still, the early phases were fruitful for gene discovery: 164 diagnoses made in novel disease genes, and 100+ discovery publications.
- Over time, the diagnostic rate has dropped while the number of candidate genes has increased.
- Diagnoses in known disease genes persist, even in the last era among families with negative genome-wide testing.
Gene Discovery Is Getting Harder
The authors describe the first era as the “golden age” of gene discovery in rare disease. They selected the best cohorts and families, very little testing had been performed, and they uncovered a diagnosis in more than half. They produced >20 publications per year, and got one publication for every ~6 families enrolled. That truly was the golden age.
In the second era, the rate of discovery slowed. Depending on how the time is measured, their publications-per-year output dropped about 75% (from 20 to 5). Also, they now were enrolling ~21 families for each discovery publication. An interesting quirk here is that the rate of diagnoses in known genes increased. Remember, the families in era 2 had often undergone single-gene or panel testing. It’s tempting to infer that the exome was simply the superior test, but there are a number of possible explanations for a panel-negative patient who is diagnosed with a “known gene” by exome:
- The gene was discovered after the panel testing was performed
- The patient represented a new phenotype or phenotype expansion of a known disorder
- A VUS on the panel testing turned out to be pathogenic
- The patient had multiple genetic diagnoses (3.5% of cases in this study)
By my guess, the first two explanations above likely account for most known-gene diagnoses. Many genes were being discovered within this period. In addition, one of Care4Rare’s goals was to further delineate the genotype-phenotype spectrum of known disorders.
The third era is fundamentally different: families are enrolled only after standard-of-care genome-wide sequencing (exome or genome). For each one, the authors developed a stepwise approach that begins with exome reanalysis, followed by genome, transcriptome, or deep sequencing. The choice of assay is made by an interdisciplinary team on a case-by-case basis.
Simply put, these are the hardest cases. It’s reflected in the diagnostic rate so far in the third era (14%). The good news is that only 8% of diagnoses are made in known genes. The bad news is that identifying new genes remains a struggle. As the authors put it:
Our pace of novel gene discovery slowed as RDs become more challenging to solve: our FORGE discoveries were from large cohorts with recognizable syndromes and a single causative gene, whereas we now tackle N-of-1 RDs with complex genetic mechanisms outside the reach of the exome.
Broadly speaking, we expect a lower diagnostic rate in modern rare disease studies simply because of ascertainment: as testing methods improve and more disease genes are identified, a greater proportion of patients should be diagnosed by standard-of-care testing. It follows that families still on a diagnostic odyssey represent more challenging cases: hard-to-detect variants, pathogenic variants in as-yet-undiscovered genes, complex inheritance, multiple genetic diagnoses, etc.
From Siloed Cohorts to International Matchmaking
Another significant change in the field during this period was the rise of international matchmaking for rare disease genes. These services connect investigators who are studying the same novel disease gene. GeneMatcher is the most widely used service, but there are at least a dozen others that aim to do the same thing. Care4Rare Canada leaned heavily into using the MatchMaker Exchange, a clearinghouse that allows users to make a single query across various connected databases. Over time, matchmaking replaced all other approaches as the primary strategy for establishing new disease genes:
In a separate study, members of the Care4Rare Consortium published the outcomes of 194 genes submitted to the MatchMaker Exchange over a 2-year period (July 2018 – July 2020). That paper almost merits its own blog post, but here are some of the highlights.
- Submission of 194 novel candidate genes (from 164 undiagnosed probands) to MME resulted in 1,514 matches.
- These were consolidated into 861 potential connections. Initial review ruled out 129, leaving 732 that required contact to request more information.
- Of 732 e-mails sent, only 413 (56%) got a response. These led to collaborations for 29 of 194 genes, or 15%.
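To see how steeply that funnel narrows, the counts quoted above can be tallied in a few lines of Python. This is only an illustrative sketch: the numbers are the ones cited in this post, while the stage labels and the `step_rates` helper are names made up here, not anything from the study itself.

```python
# Counts as quoted in this post for the MME study (July 2018 - July 2020).
# Stage labels are informal descriptions chosen for this sketch.
funnel = [
    ("matches returned", 1514),
    ("potential connections", 861),
    ("connections needing contact", 732),
    ("e-mail responses", 413),
]

genes_submitted = 194
genes_with_collaboration = 29


def step_rates(stages):
    """Return (label, count, fraction of the previous stage) for each stage after the first."""
    return [
        (label, count, count / prev_count)
        for (_, prev_count), (label, count) in zip(stages, stages[1:])
    ]


for label, count, rate in step_rates(funnel):
    print(f"{label}: {count} ({rate:.0%} of previous stage)")

# Collaborations are counted per gene, not per match, so they get their own ratio.
collaboration_rate = genes_with_collaboration / genes_submitted
print(
    f"collaborations: {genes_with_collaboration} of {genes_submitted} genes "
    f"({collaboration_rate:.0%})"
)
```

Note that the last ratio has a different denominator from the rest: matches and e-mails are counted per connection, whereas collaborations are counted per submitted gene.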
The number of matches obtained for a newly submitted gene was highly variable. Most submissions (93%) returned at least one match. Interestingly, the number of matches returned was correlated with the probability of a resulting collaboration: genes with 16+ matches were far more likely to become a collaboration (42%) than genes with 1-5 matches (8%). This is consistent with my own experience in using GeneMatcher/MME. Also, most of the successful connections (24 of 29) were made immediately after MME submission, and two-thirds of these were for syndromic intellectual disability.
Anyone who does matchmaking for gene discovery will tell you that it’s very useful, but also a lot of work. An initial submission requires only 5-10 minutes, but following up with the resulting matches takes hours and hours of effort, often over a period of months or years. The process does work, but it’s not scalable. We still have work to do.
In summary, gene discovery in rare disease has changed significantly in the past decade. The initial gold rush is over, but there are plenty more genes out there waiting to be discovered.