Early infantile epileptic encephalopathy (EIEE) is a devastating syndrome of intractable seizures that strike in the first months of life. According to Orphanet, it affects 1 in 50,000-100,000 births. Infants with EIEE may suffer hundreds of tonic spasms per day, during both sleep and wakefulness. The prognosis is poor. Most patients die within two years; those who survive are severely impaired.
Most cases of EIEE arise sporadically (i.e. without a family history), though autosomal recessive inheritance has also been reported. Clinical diagnosis can be challenging, since infantile epilepsy is also associated with certain brain malformations and metabolic syndromes. Genetic testing for EIEE is made challenging by the fact that upwards of 50 genes have been associated with the disorder (current gene panels test ~50-130 genes). Even so, 40% of patients tested by gene panel or exome sequencing fail to achieve a diagnosis.
Earlier this month, a group at the University of Utah published a study demonstrating the power of whole-genome sequencing as a diagnostic tool for EIEE. Their cohort comprised 14 EIEE patients (recruited in 2015-2016) who had undergone extensive prior testing without receiving a diagnosis. The probands were tested, along with their parents, by deep whole-genome sequencing (~65x coverage) and comprehensive variant analysis.
Comprehensive Whole Genome Analysis
One showpiece of this study is the bioinformatics methodology the authors employed to maximize their chances of detecting causal variants. They applied well-established tools to call SNVs/indels (GATK) and SVs/CNVs (LUMPY/SVtyper). Because they were particularly interested in de novo mutations (present in the child but not in either parent), they also used a reference-free k-mer analysis algorithm called RUFUS to reveal sequences present in the proband that were not found in either parent, suggestive of a de novo event.
This is important because most of the apparent de novo mutations identified by routine variant analysis tools (e.g. GATK) are not real: they’re either alignment artifacts in the child, or variants that were missed in one of the parents. Because a reference-free k-mer comparison involves no alignment step, it avoids the former. It’s a cool idea, really. Wish I had thought of it myself.
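To make the k-mer idea concrete, here is a toy sketch of the general principle: collect every k-mer seen in the child's reads and subtract anything seen in either parent. This is not RUFUS's actual implementation, and the function names and tiny reads are made up for illustration. A real tool would presumably also need to handle reverse complements, sequencing errors, and coverage thresholds.

```python
def kmers(read, k):
    """Return the set of all length-k substrings of one read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def kmer_set(reads, k):
    """Union of k-mers across a collection of reads."""
    out = set()
    for read in reads:
        out |= kmers(read, k)
    return out


def child_specific_kmers(child_reads, mother_reads, father_reads, k=5):
    """k-mers present in the child but absent from both parents."""
    parental = kmer_set(mother_reads, k) | kmer_set(father_reads, k)
    return kmer_set(child_reads, k) - parental


# Toy data (hypothetical): the child's second read carries a single G>C change.
mother = ["ACGTACGTTAGC"]
father = ["ACGTACGTTAGC"]
child = ["ACGTACGTTAGC", "ACGTACCTTAGC"]

novel = child_specific_kmers(child, mother, father, k=5)
print(sorted(novel))
```

In this toy case a single base change produces exactly k child-specific k-mers, all overlapping the altered position, and no reference genome or alignment is involved at any point.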
The authors prioritized variants using what one might call the “lowest hanging fruit” strategy. They only looked at de novo events, and their search tiers were as follows:
- Nonsynonymous coding mutations in known or candidate EIEE-associated genes
- SVs predicted to disrupt EIEE-associated genes
- Nonsynonymous coding mutations in genes not associated with EIEE
- SVs affecting genes not associated with EIEE
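The four tiers above amount to a simple decision rule, which might be sketched like this. The variant dictionaries, field names, and the short gene set are my own illustrative assumptions, not the authors' data structures or their actual EIEE gene list.

```python
# Illustrative subset only; the study's real EIEE gene list is much longer.
EIEE_GENES = {"CDKL5", "KCNQ2", "STXBP1"}


def search_tier(variant, eiee_genes=EIEE_GENES):
    """Return the search tier (1-4) for a variant, or None if it is out of scope.

    `variant` is a hypothetical dict with keys:
      de_novo (bool), kind ("nonsynonymous" or "sv"), gene (str).
    """
    if not variant["de_novo"]:
        return None  # only de novo events were considered
    in_eiee_gene = variant["gene"] in eiee_genes
    if variant["kind"] == "nonsynonymous":
        return 1 if in_eiee_gene else 3
    if variant["kind"] == "sv":
        return 2 if in_eiee_gene else 4
    return None  # e.g. synonymous or noncoding changes are not tiered here


examples = [
    {"de_novo": True, "kind": "nonsynonymous", "gene": "KCNQ2"},
    {"de_novo": True, "kind": "sv", "gene": "CDKL5"},
    {"de_novo": True, "kind": "nonsynonymous", "gene": "DEAF1"},
    {"de_novo": False, "kind": "nonsynonymous", "gene": "KCNQ2"},
]
tiers = [search_tier(v) for v in examples]
print(tiers)
```

Working through the tiers in order means the search stops at the most easily interpreted explanation before moving on to harder-to-interpret candidates.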
De Novo Mutations in Infant Epilepsy Genes
Notably, the first tier search returned a diagnosis — a pathogenic/likely pathogenic mutation in an EIEE-associated gene — for 10/14 subjects (71%). That’s close to the expected diagnostic rate, and thus not terribly surprising. What is a bit surprising is that three of these subjects reportedly had negative epilepsy gene panels prior to enrollment, which raises the question of how such obvious pathogenic mutations were missed.
One case was solved in search tier 2, when the proband was found to harbor a de novo 63-kb duplication within CDKL5. The duplication was predicted to cause a frameshift and early stop, and it was on the X-chromosome (so hemizygous) in a male patient.
Mutations in Strong Candidate Genes
Two cases were solved in search tier 3. One harbored a mutation in DEAF1 (associated with dominant intellectual disability and recessive epilepsy). The variant (p.G212S) lies in the SAND domain, where virtually all pathogenic / likely pathogenic missense variants in ClinVar reside.
The p.G212S variant is itself already in ClinVar, as it was reported in a different patient with dominant intellectual disability and seizures.
Another harbored a missense mutation in CAMK2G, which the authors propose as a novel EIEE gene. CAMK2G encodes the gamma subunit of the calcium/calmodulin-dependent protein kinase II complex, which plays an essential role in synaptic function. The missense variant is predicted to be damaging by about half of in silico algorithms (our lab would not apply PP3 here, but this doesn’t change the interpretation). There is one reported heterozygote for this variant in gnomAD, but read data are not available and in my opinion that’s within the margin of error. Missense variants in CAMK2G appear to be under constraint. Overall, a pretty good candidate.
A de novo balanced translocation
So now there’s only one case to solve, and for that one the authors plunge into search tier 4 (SVs not affecting EIEE genes), also called the bucket of crap. I have spent countless hours running down candidates in tier 4 for our rare disease cases with little to show for it. Unlike me, however, the authors of this study actually found something that may be relevant: a de novo balanced translocation between chromosome 2 (2p16.1) and the X-chromosome (Xq28). A 92-gene segment from chromosome X thus ends up on chromosome 2, where the authors speculate that X-inactivation may be disrupted, altering transcription of the translocated genes.
One of the translocated genes is MECP2, which also harbored a de novo noncoding mutation in this patient. Mutations in MECP2 can cause Rett syndrome, an X-linked dominant disorder that manifests with a rather specific phenotype. From OMIM:
Rett syndrome is a neurodevelopmental disorder that occurs almost exclusively in females. It is characterized by arrested development between 6 and 18 months of age, regression of acquired skills, loss of speech, stereotypic movements (classically of the hands), microcephaly, seizures, and mental retardation.
Rett, a Viennese pediatrician, first described Rett syndrome after observing 2 girls who exhibited the same unusual behavior who happened to be seated next to each other in the waiting room.
The patient in this study had microcephaly, seizures, and global delay, which are all prominent features of the Rett phenotype. I’d be very curious to learn if she also develops the characteristic hand movements. While not definitive, the fact that these alterations are de novo, and affect multiple highly relevant genes, suggests that these are somehow responsible for the EIEE phenotype.
Molecular Diagnosis versus Candidate Gene
In the abstract, Ostrander et al. write that “the detection of a pathogenic or likely pathogenic mutation in all 14 subjects demonstrates the utility of WGA.” If I wanted to nitpick, I’d argue that they have P/LP variants in 12 patients, since CAMK2G is not yet established as an EIEE gene and the causal role of the chr2-chrX translocation in patient 2 remains unproven. More accurately, the authors have uncovered a P/LP variant or strong candidate variant in all 14 patients.
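For anyone who wants to check my nitpick, the tally works out as follows. The per-tier counts come from the case descriptions earlier in this post; the variable names are just mine.

```python
# Cases solved per search tier, as described earlier in this post.
solved_by_tier = {1: 10, 2: 1, 3: 2, 4: 1}
total_cases = sum(solved_by_tier.values())

# Findings I would call "strong candidates" rather than established P/LP.
candidates = {
    "CAMK2G missense (tier 3)": 1,
    "chr2-chrX translocation (tier 4)": 1,
}
established_plp = total_cases - sum(candidates.values())

print(f"Total with a finding: {total_cases}")
print(f"Established P/LP:     {established_plp} "
      f"({100 * established_plp / total_cases:.0f}%)")
```

So the count drops from 14 to 12 once the two candidate findings are set aside, which is still a very high yield for a cohort that had previously gone undiagnosed.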
The cohort in this study does not entirely represent clinical reality, particularly because only subjects for whom both parents were willing and available to participate were included. That’s an ideal setup if one wants to pursue only de novo mutations. A patient with only one available parent (or no available parents) is a common situation in genetics clinic and would present considerable challenges to this type of analysis.
Whole Genome Sequencing Cost and Speed
Most geneticists recognize the power of WGS to comprehensively detect genomic variants, and to return those results relatively quickly. The cost of WGS, however, remains a significant barrier. Clinical WGS currently costs around $15,000 per trio, roughly twice the cost of clinical exome sequencing. Yet, as Ostrander et al. point out, the patients in this cohort underwent a minimum of 24 diagnostic tests, costing (on average) $30,866 per patient.
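A quick back-of-the-envelope comparison using only the figures quoted above. The exome estimate is simply half the WGS trio price, per the "roughly twice" statement, so treat it as an approximation rather than a quoted price.

```python
wgs_trio_cost = 15_000        # approximate clinical WGS cost per trio (USD)
prior_testing_cost = 30_866   # average prior diagnostic spend per patient (USD)

# "Roughly twice the cost of clinical exome sequencing" implies about half.
exome_cost_estimate = wgs_trio_cost / 2

ratio = prior_testing_cost / wgs_trio_cost
difference = prior_testing_cost - wgs_trio_cost

print(f"Implied exome cost:         ~${exome_cost_estimate:,.0f}")
print(f"Prior testing vs. WGS trio: {ratio:.2f}x")
print(f"Difference per patient:     ${difference:,}")
```

On these numbers, the average prior workup cost about twice as much as a WGS trio would have.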
Perhaps just as importantly, at least three of the patients in this study had undergone gene panel and/or exome sequencing which failed to detect a rather obvious pathogenic mutation. The authors did not explore this further, at least in the paper. Yet it emphasizes that WGS may provide faster and more comprehensive results as a diagnostic tool.