Epilepsy is a broad term used to describe brain disorders that cause seizures. About 1.2% of the population has active epilepsy; in the United States this includes more than 3 million adults and 470,000 children. According to the CDC, about 1 in 10 people will have a seizure in their lifetime.
(Side note: this means it’s somewhat likely that you’ll need to help someone during or after a seizure. Please take 5 minutes to read the CDC’s page on first aid for seizures so that you know what to do and what not to do.)
The causes of epilepsy are manifold, including strokes, tumors, infections, and injuries affecting the brain. There are also a number of genetic causes of epilepsy, hence my interest. Most currently known epilepsy genes are associated with monogenic (single gene) epilepsy disorders and were identified in family studies. These are mostly genes for developmental and epileptic encephalopathies, which tend to begin early in life and have profound effects on development.
The etiology of other forms of epilepsy, namely generalized epilepsy (affecting both brain hemispheres) and focal epilepsy (involving a localized region), is more complex. Genetic studies have struggled to pinpoint the genes involved. For all three types of epilepsy, however, it’s clear that many more genes remain to be discovered.
A manuscript in press at the American Journal of Human Genetics describes the largest exome sequencing study of epilepsy conducted to date. It reports the exomes of 9,170 people with epilepsy (1,476 with developmental/epileptic encephalopathy, 4,453 with generalized epilepsy, and 5,331 with focal epilepsy) compared to 8,436 ancestry-matched controls. There’s a lot of good stuff here, especially if you’re into analysis methods (as I am). For the moment, I thought I’d highlight the results.
Analysis of Ultra-rare Variants in Epilepsy
The authors performed three types of analyses. The main analysis, as indicated in the title of the paper, focused on “ultra-rare” variants: those observed in 3 or fewer individuals in the entire cohort and absent from the 50,726 individuals in the DiscovEHR database of healthy [European ancestry] individuals.
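To make the filter concrete, here's a minimal Python sketch of the ultra-rare definition as I read it: keep a variant only if it has 3 or fewer carriers in the cohort and does not appear in the reference database. The function name, the data layout, and the toy variant IDs are my own placeholders, not anything from the paper.

```python
from collections import Counter

def ultra_rare_variants(cohort_calls, reference_variants, max_carriers=3):
    """Return variant IDs with at most `max_carriers` carriers in the cohort
    that are also absent from the reference set (hypothetical data layout)."""
    carriers = Counter()
    for sample_id, variants in cohort_calls.items():
        # Count each sample once per variant, even if it is listed twice.
        for variant in set(variants):
            carriers[variant] += 1
    return {
        variant
        for variant, n in carriers.items()
        if n <= max_carriers and variant not in reference_variants
    }

# Toy data: sample ID -> variant IDs called in that sample (all made up).
cohort_calls = {
    "case1": ["chr2:100A>T", "chr5:200G>C"],
    "case2": ["chr2:100A>T", "chr7:300C>T"],
    "ctrl1": ["chr2:100A>T", "chr9:400T>G"],
    "ctrl2": ["chr2:100A>T"],
}
# Stand-in for a population database such as DiscovEHR.
reference_variants = {"chr7:300C>T"}

ultra_rare = ultra_rare_variants(cohort_calls, reference_variants)
print(sorted(ultra_rare))
```

In this toy set, the chr2 variant is dropped because it has four carriers, and the chr7 variant is dropped because it is present in the reference set.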
Variants were annotated using VEP and a suite of other tools, with the following classes of variants considered in most analyses:
- Protein-truncating variants (PTVs), meaning nonsense, frameshift, and splice site variants
- Missense variants predicted to be damaging by in silico tools and a regional constraint score.
- Inframe insertions and deletions, which appear to be treated the same as damaging missense
- Benign missense variants that did not qualify as damaging
- Synonymous variants, which do not change the encoded amino acid
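The classes above map fairly naturally onto the Sequence Ontology consequence terms that VEP reports. Here is a rough sketch of that mapping. The `predicted_damaging` flag is a stand-in for whatever combination of in silico tools and regional constraint the authors used, which I'm not reproducing here, and the class labels are my own.

```python
# Sequence Ontology terms as reported by VEP.
PTV_TERMS = {
    "stop_gained",              # nonsense
    "frameshift_variant",
    "splice_acceptor_variant",  # essential splice sites
    "splice_donor_variant",
}
INFRAME_TERMS = {"inframe_insertion", "inframe_deletion"}

def classify_variant(consequence, predicted_damaging=False):
    """Assign a variant to one of the classes listed above.

    `predicted_damaging` is a hypothetical boolean summarizing the
    in silico predictions; it only matters for missense variants.
    """
    if consequence in PTV_TERMS:
        return "PTV"
    if consequence in INFRAME_TERMS:
        return "inframe_indel"
    if consequence == "missense_variant":
        return "damaging_missense" if predicted_damaging else "benign_missense"
    if consequence == "synonymous_variant":
        return "synonymous"
    return "other"

print(classify_variant("stop_gained"))
print(classify_variant("missense_variant", predicted_damaging=True))
```

Since the post notes that inframe indels appear to be treated like damaging missense, a downstream analysis could simply merge those two labels.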
Excess of Ultra-rare Variants in Genes
The authors prioritized ultra-rare damaging/truncating variants across all genes (“the exome”) and performed burden tests to estimate the excess of such variants in cases relative to controls.
Protein-truncating variants (PTVs) are clearly enriched in cases relative to controls at these extremely rare frequency levels. At less stringent MAF thresholds, this enrichment is apparent but not as significant.
Next, the authors did something that I like, which was to examine the burden of ultra-rare PTVs in genes where such variants do not appear to be tolerated. They used gnomAD’s constraint metrics to identify genes under constraint for missense variants (Z>3.09) or loss-of-function variants (which PTVs usually are).
Most of the burden of urPTVs comes from genes showing strong evolutionary constraint for LOF variants (pLI>0.995). This suggests that the excess PTVs affect genes where such variants are not tolerated in the general population. It’s another way of showing that these variants are probably deleterious.
The authors estimated the excess of ultra-rare variants in numerous subsets of biologically-relevant genes, finding enrichment for damaging/PTV variants in:
- Evolutionarily “constrained” genes (i.e. pLI>0.9 or missense Z-score >3.09)
- Brain-enriched genes, defined as those expressed 2x higher in brain tissues according to GTEx data.
- GABAergic pathway genes and voltage-gated ion channel genes
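As a small illustration of the first gene set, this is how the "constrained" definition (pLI>0.9 or missense Z-score >3.09) could be applied to a table of per-gene constraint metrics. The gene names and scores below are invented for the example; real values would come from gnomAD's constraint files.

```python
def is_constrained(pli, missense_z, pli_cutoff=0.9, missense_z_cutoff=3.09):
    """True if a gene passes either constraint cutoff quoted in the post."""
    return pli > pli_cutoff or missense_z > missense_z_cutoff

# Hypothetical genes: name -> (pLI, missense Z-score).
gene_metrics = {
    "GENE_A": (0.99, 1.2),  # LoF-intolerant only
    "GENE_B": (0.10, 4.0),  # missense-constrained only
    "GENE_C": (0.05, 0.3),  # neither
}

constrained_genes = sorted(
    gene for gene, (pli, z) in gene_metrics.items() if is_constrained(pli, z)
)
print(constrained_genes)
```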
Of note, most of these patterns were in the developmental/epileptic encephalopathy and generalized epilepsy patient groups. The focal epilepsy group — incidentally, also the largest group — yielded few findings, and the authors’ frustration with this outcome is not hard to suss out (or understand).
Per-Gene Burden Testing for Association
Ultra-rare variants were collapsed by gene to examine the burden of such variants in cases relative to controls. This is a fairly standard way to test for association when the variants are (by definition) too rare to reach statistical significance on their own. Basically, you consider anyone with at least one ultra-rare coding variant in a gene to be “mutated” and everyone else to be “non-mutated” for the sake of analysis.
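Here is a bare-bones sketch of that collapsing idea in Python. I'm using a one-sided Fisher's exact test on carrier counts because it is a common choice for this kind of 2x2 comparison; I have not confirmed that it matches the test used in the paper, and the sample IDs and gene names are made up.

```python
from math import comb

def collapse_by_gene(qualifying_variants, cases, controls):
    """Collapse (sample_id, gene) pairs into carrier counts per gene.

    A sample counts once per gene no matter how many qualifying
    variants it carries. Returns {gene: (case_carriers, control_carriers)}.
    """
    carriers = {}
    for sample_id, gene in qualifying_variants:
        carriers.setdefault(gene, set()).add(sample_id)
    return {
        gene: (len(samples & cases), len(samples & controls))
        for gene, samples in carriers.items()
    }

def fisher_exact_greater(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact p-value for an excess of carriers among cases."""
    total_carriers = case_carriers + control_carriers
    denom = comb(n_cases + n_controls, total_carriers)
    upper = min(total_carriers, n_cases)
    # Sum hypergeometric probabilities for outcomes at least this extreme.
    numer = sum(
        comb(n_cases, k) * comb(n_controls, total_carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numer / denom

# Toy cohort: 6 cases and 6 controls.
cases = {f"c{i}" for i in range(1, 7)}
controls = {f"k{i}" for i in range(1, 7)}
qualifying_variants = [
    ("c1", "GENE_X"), ("c1", "GENE_X"),  # two variants, one carrier
    ("c2", "GENE_X"), ("c3", "GENE_X"), ("c4", "GENE_X"),
    ("c5", "GENE_Y"), ("k1", "GENE_Y"),
]

counts = collapse_by_gene(qualifying_variants, cases, controls)
p_values = {
    gene: fisher_exact_greater(a, len(cases), b, len(controls))
    for gene, (a, b) in counts.items()
}
print(counts)
print(p_values)
```

A real analysis would also need to handle covariates such as ancestry and sequencing batch, which this sketch ignores.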
Of note, it’s not only possible, but somewhat likely, that many such variants are de novo mutations. We don’t know for certain, since only the affected individuals and not their parents were sequenced.
There are some major statistical challenges to conducting per-gene tests on the entire exome in a large sample size. The adjustments required to minimize false-positive associations when testing so many genes simultaneously mean that the bar for statistical significance is high. In this study, SCN1A in individuals with developmental and epileptic encephalopathies (p=5.8e-08) was the only gene that met so-called exome-wide significance for association (p-value < 6.8e-07 after Bonferroni correction).
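For a sense of scale, a Bonferroni threshold is just the overall error rate divided by the number of tests. The quick calculation below works backwards from the reported threshold, assuming the conventional overall alpha of 0.05 (my assumption; I'm not quoting the paper's exact test count).

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

alpha = 0.05                  # assumed family-wise error rate
reported_threshold = 6.8e-07  # exome-wide threshold quoted above
scn1a_p = 5.8e-08             # SCN1A p-value quoted above

# Number of tests implied by the reported threshold (about 73,500).
implied_tests = alpha / reported_threshold
scn1a_significant = scn1a_p < reported_threshold

print(round(implied_tests))
print(scn1a_significant)
```

That implied count is several times the number of protein-coding genes, which would be consistent with each gene being tested more than once (for example, across epilepsy types or variant classes).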
A jaded view of this outcome might be that the largest exome sequencing study of epilepsy managed to find one gene that was already the best-known and most clinically relevant epilepsy gene. But that’s hardly satisfying. Furthermore, some argue that focusing only on p-values causes one to ignore patterns that are obvious to the human eye, such as the association signal for NEXMIF.
A bit of good news is that several candidate epilepsy genes made appearances among the leading associations, including EEF1A2, GABRG2, SLC6A1, and GABRA1 for generalized epilepsy and DEPDC5 and SCN8A in focal epilepsy. This study does not make their associations definitive, but adds support to the notion that further study of those candidates is warranted.
Recessive and SKAT Analyses
The authors also performed two other types of analyses. One searched for genes consistent with recessive inheritance that were enriched in cases relative to controls, and the other examined the contribution of low-frequency coding variants to epilepsy risk. Both of these used a relaxed threshold for variant inclusion, taking any variant with MAF<0.01. The recessive analysis found nothing, though the authors note that it was somewhat under-powered.
The SKAT burden analysis did not yield much in the way of compelling findings (surprising no one). Its top associations included known epilepsy genes like STXBP1, KCNA2, NEXMIF, and SCN1A, but these failed to reach exome-wide significance.
The take-home message here is that ultra-rare, possibly de novo damaging/truncating variants in constrained genes are enriched in the developmental/epileptic encephalopathy subset, and this underlies most of the gene burden in a fairly large epilepsy cohort. It supports the notion that DEE is often caused by damaging, highly penetrant mutations in single dominant genes. The essentially negative findings in generalized and focal epilepsy suggest that these types may have a more complex genetic etiology.