The first ~1.5 days of the American Society of Human Genetics annual meeting were a blur, but rare disease genomics and long-read sequencing are already emerging as themes. We’re in Boston this year, a lovely city filled with historic landmarks that I will not see this week.
Effects of ~1 Million 5′ UTR Variants
The highlight of the first plenary session was a talk by Srikar Gopinath (Yale) on the functional impact of 5′ UTR variants in human disease. His lab developed a massively parallel reporter assay (NaP-TRAP) which quantifies mRNAs bound by actively translating ribosomes. I don’t claim to understand the laboratory method or the associated machine learning, but it essentially let them test the effects of variants in 5′ UTRs on protein production. This region, aptly described as “the runway of protein synthesis”, contains numerous regulatory elements where genetic variation could have a major impact but is challenging to classify. This high-throughput system allowed the introduction and testing of thousands of variants simultaneously. The team evaluated:
- 800,000 variants in 17,824 genes (from the UK Biobank, I think)
- 90,000 variants in 2,446 genes from gnomAD
- 200,000 “unobserved” theoretically possible variants expected to be lethal
In total, >1 million variants were tested in this functional system and the results yielded some fascinating insights into the effects of 5′ UTR variation on translation. As expected, many variants with high impacts on protein production were already classified as pathogenic in ClinVar. Variants that altered start codons or introduced shifts in the reading frame often had large effects. So did sequence variants at key positions of the Kozak sequence, a highly conserved motif that surrounds the start codon.

Variants at the -3 and +4 positions of that sequence had significant effects. This matches what we’ve known about the Kozak sequence for decades: the consensus sequence is “defined as 5′-GCC(A/G)CCAUGG-3′, with critical positions at -3 (preferably A or G) and +4 (G) relative to the AUG codon.” In other words, this large-scale assay experimentally confirms the functional relevance of conserved DNA sequences.
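For readers who want to see how those two positions are counted, here is a minimal Python sketch. It numbers the A of the AUG as +1 and the base immediately upstream as -1, then checks for a purine at -3 and a G at +4. The function name `kozak_context` and the strong/adequate/weak labels are my own simplification for illustration; this is not the NaP-TRAP analysis or anything presented in the talk.

```python
def kozak_context(seq: str, aug_index: int) -> str:
    """Label the Kozak context of the AUG that starts at aug_index (0-based).

    Simplified, illustrative rule (an assumption, not the talk's method):
      strong   = purine (A/G) at -3 AND G at +4
      adequate = exactly one of the two
      weak     = neither
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("need 3 bases upstream and 1 base downstream of the AUG")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at aug_index")

    minus3 = seq[aug_index - 3]  # A of AUG is +1, so -3 is three bases upstream
    plus4 = seq[aug_index + 3]   # first base after the AUG
    matches = (minus3 in "AG") + (plus4 == "G")
    return {2: "strong", 1: "adequate", 0: "weak"}[matches]


# The consensus from the quote above, with A at the -3 position
print(kozak_context("GCCACCAUGG", 6))
# The same sequence with a U substituted at -3
print(kozak_context("GCCUCCAUGG", 6))
```

A single substitution at -3 or +4 moves the site down one category in this toy scheme, which is the kind of change the assay measured directly as altered translation.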
Rare Diseases Need Room
Day 1 of the meeting featured a long security line for the convention center. I hiked up to the distant room hosting the session on Rare Disease Research through Collaborative Genomics only to find that it was already full to bursting. Standing room only in what appeared to be the smallest room assigned to featured symposia. Meanwhile, far less popular sessions on community engagement and imaging were less than half full. The RD session had nice talks on accelerating genomic discovery (i.e. the GREGOR consortium’s work, from Stephen Montgomery), a new method for prioritizing structural variants from long-read sequencing, and Kaitlin Samocha’s annual talk on identifying disease genes through constraint.
The real highlight was the final speaker, Ada Hamosh from Johns Hopkins, who runs the OMIM disease gene database. She made an impassioned plea to rare disease researchers to phenotype our patients at all levels, from cell to tissue to organ system. She also laid out what her team wants to see in papers that report gene-disease relationships. In their view, the ideal paper would include:
- Detailed, information-rich phenotype tables which clarify not only the presence or absence of specific phenotypes, but also indicate whether the phenotype was confirmed as absent, or not looked for, in each patient (these are different things).
- More detail than a simple plus symbol for things like brain malformations (where and what type) and seizures (how many, what type, which anti-epileptic drugs were tried).
- Numerators and denominators (e.g. 5/9) rather than percentages of patients with each feature.
- For each patient, a written “vignette” narrative of the life course so far and key medical concerns, such as prematurity, that could influence phenotypes.
- A clear pedigree for every proband/family reported, including the segregation status of reported variant(s).
- Segregation testing and functional validation of as many of the reported variants as possible (ideally: all).
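To make the first and third points concrete, here is a small Python sketch of a phenotype table that keeps "confirmed absent" separate from "not looked for" and reports counts as numerator/denominator. Everything in it is hypothetical: the `Status` enum, the `summarize` helper, the patient IDs and the phenotype values are made up for illustration. Using the number of assessed patients as the denominator is my assumption, not something Dr. Hamosh specified.

```python
from enum import Enum


class Status(Enum):
    PRESENT = "+"
    ABSENT = "-"           # looked for and confirmed absent
    NOT_ASSESSED = "NA"    # not looked for (a different thing)


def summarize(cohort: dict, feature: str) -> str:
    """Return 'present/assessed' for one feature across the cohort."""
    # A feature missing from a patient's record is treated as not assessed
    statuses = [p.get(feature, Status.NOT_ASSESSED) for p in cohort.values()]
    present = sum(s is Status.PRESENT for s in statuses)
    assessed = sum(s is not Status.NOT_ASSESSED for s in statuses)
    return f"{present}/{assessed}"


# Made-up toy cohort, for illustration only
cohort = {
    "P1": {"macrocephaly": Status.PRESENT, "seizures": Status.PRESENT},
    "P2": {"macrocephaly": Status.PRESENT, "seizures": Status.ABSENT},
    "P3": {"macrocephaly": Status.ABSENT, "seizures": Status.NOT_ASSESSED},
    "P4": {"macrocephaly": Status.PRESENT},  # seizures never recorded
}

for feature in ("macrocephaly", "seizures"):
    print(feature, summarize(cohort, feature))
```

Collapsing the two "not present" states into a single minus sign would change the seizure denominator from 2 to 4, which is exactly the ambiguity a curator cannot resolve from a percentage.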
Dr. Hamosh also encouraged the audience not to lean too heavily on HPO terms. While they are useful as structured vocabulary, to a curator they can be hard to digest en masse. She likes precise, informative phenotypic descriptions for key features to be included in the abstract when possible. Her model abstract, from a good paper chosen somewhat at random, included something like “Ten of twelve patients had macrocephaly (Z-scores 2.5 – 4.5)”. Bottom line, her presentation illustrated the significant, careful work done by the OMIM curators and ways that we, as researchers, can make it easier.
Words from Francis Collins
Another quick highlight: I arrived on Tuesday in time to catch the tail-end of a talk from Francis Collins (leader of the Human Genome and HapMap projects, former director of the NIH), who by his own admission stood before us as “an unemployed individual.” As you might expect, his take on the current state of affairs in the NIH and the US was a bit of a downer. But he did highlight some key breakthroughs in the field, like gene therapy for cystic fibrosis, that have happened in spite of the many headwinds. He ended by exhorting the audience:
“So, ASHG family, let us be people of action and science that bring hope to a hurting world.”