Some of the (many) recent advances in genetic testing are in the area of non-invasive prenatal testing, or NIPT. This form of genetic screening utilizes the blood of an expecting mother to screen for chromosomal abnormalities and other rare disorders in the fetus. It’s an area of intensive research at the moment, and also the subject of a high-profile article (probably paywalled) in the New York Times by Sarah Kliff and Aatish Bhatia. The story was published on New Year’s Day with the provocative title:
When They Warn of Rare Disorders, These Prenatal Tests Are Usually Wrong.
Essentially, it’s about the advent of NIPT in the United States, the companies who offer the tests, and the positive predictive value of the results. This excerpt captures the thrust of it:
In just over a decade, the tests have gone from laboratory experiments to an industry that serves more than a third of the pregnant women in America, luring major companies like Labcorp and Quest Diagnostics into the business, alongside many start-ups.
The tests initially looked for Down syndrome and worked very well. But as manufacturers tried to outsell each other, they began offering additional screenings for increasingly rare conditions.
The grave predictions made by those newer tests are usually wrong, an examination by The New York Times has found.
The NYT’s “examination” included, among other things:
- Pooling the results of five published studies on NIPT outcomes
- Conducting interviews with genetic counselors and other healthcare professionals
- Reviewing the marketing materials of several commercial NIPT offerings
- Collecting personal stories from patients who received false positive results
On the Accuracy of Rare Positive Results
Notice the subtle and careful wording used in the headline and in the excerpt above:
“When they warn of rare disorders”
“The grave predictions made”
In scientific terms, both of these phrases are referring to positive screening results, not all screening results. My response to the sensationalist claim that positive NIPT tests are often false positives? Of course they are. This is the nature of screening. I’m reminded of it every time I go through metal detectors at the airport.
The excerpt I provided is not the start of the article. This is the NYT and they know what they’re doing. The article opens with the story of a young pregnant woman whose NIPT returns a scary-sounding diagnosis for her unborn child (Prader-Willi syndrome) that turns out to be a false positive. It goes on to briefly summarize the rise of NIPT and its expansion to include increasingly rare disorders, especially ones associated with microdeletions.
The outcome, as illustrated by numerous graphics like the one at right, is that for every 15 times such tests correctly identify a genomic alteration, they are wrong 85 times. However, what none of these visually striking graphics tell you is that for every positive result there are thousands of negative results. In other words, the overall accuracy of NIPT is extremely high.
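To make the gap between those two numbers concrete, here is a minimal sketch in Python. The 15 and 85 come from the NYT graphics; the count of 100,000 negatives (1,000 per positive) is my own illustrative stand-in for "thousands of negative results", and treating every negative as a true negative is an assumption, since the article gives no false negative counts.

```python
def screening_summary(true_pos, false_pos, true_neg, false_neg):
    """Return (positive predictive value, overall accuracy) for a screen."""
    total = true_pos + false_pos + true_neg + false_neg
    ppv = true_pos / (true_pos + false_pos)      # how often a positive is right
    accuracy = (true_pos + true_neg) / total     # how often any result is right
    return ppv, accuracy


# 15 correct and 85 incorrect positives, per the NYT graphics.
# 100,000 negatives is a hypothetical figure; all are assumed correct.
ppv, accuracy = screening_summary(
    true_pos=15, false_pos=85, true_neg=100_000, false_neg=0
)

print(f"Positive predictive value: {ppv:.1%}")
print(f"Overall accuracy: {accuracy:.2%}")
```

Under these assumptions the same test is wrong most of the time when it says "positive" and right almost all of the time overall. Both statements are true at once; which one you lead with is an editorial choice.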
Genomic Alteration Detection 101: Size Matters
Chromosomal abnormalities are a common cause of syndromic birth defects. The most prevalent of these, trisomy 21 (the cause of Down syndrome), has a prevalence of 1 in 664 newborns according to Smith’s Recognizable Patterns of Human Malformation (8th edition). Naturally, whole-chromosome or chromosome-arm abnormalities are also among the easiest things to detect by NIPT. Even with the technical challenges involved, such events leave massive footprints in the genome. That combination — large size and relative prevalence — is why early forms of NIPT that screened for trisomy 21 performed very well.
The term microdeletion comes from the field of cytogenetics and it’s a bit deceptively named: it refers to genomic deletions that are too small to be identified by looking at chromosomes with light microscopy. Yes, these events are more challenging to detect by NIPT than losses or gains of entire chromosomes. Even so, such microdeletions can still be huge, i.e. millions of base pairs long, and encompass dozens or hundreds of genes. In the prenatal setting, they tend to cause severe syndromic disorders. Many such disorders are extremely rare, and as noted, they can be more challenging to detect with accuracy. So yes, false positives may be more likely. However, something glossed over rather quickly by the NYT reporters is this: a positive screening result is not a diagnosis. It’s a signal that more testing should be performed.
Population Screening Benefits and Consequences
In biology in general, false positives are expected when you search for extremely rare events, because the few true signals are easily outnumbered by background noise. A classic example is the identification of inherited versus de novo variants in a child.
Every human carries about 4-6 million sequence variants relative to the reference sequence. The vast majority of those are inherited from one’s parents. Most, in fact, are common in human populations because they arose a long time ago. In contrast, mutations that arise de novo in a child (i.e. are absent from the parents) occur at a rate of about 1e-08 per base pair per generation, making them extremely rare. On average, there are 50-70 such mutations genome-wide. When we sequence a family trio, we identify inherited variants with extremely high accuracy, i.e. >99.9%. Sure, there are a few hundred false positives, but there are many million true positives. The signal-to-noise ratio is very high. This changes when we look only for de novo mutations. Now it’s 50 true positives against 200 false positives: a very different signal-to-noise ratio.
However, and this is important, we don’t blithely call 200 de novo mutations in a child and assume they’re all real. With further scrutiny (e.g. using population databases and removing common artifacts) we can filter out most false positives and get to the correct number.
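The trio numbers above can be put side by side in a few lines of Python. This is only a sketch: the 50 and 200 are the de novo figures quoted above, while 5,000,000 inherited true positives and 300 false positives are values I picked from within the stated ranges ("4-6 million" and "a few hundred") purely for illustration.

```python
def precision(true_pos, false_pos):
    """Fraction of reported calls that are real."""
    return true_pos / (true_pos + false_pos)


# Inherited variants: illustrative values chosen from the ranges in the text.
inherited_precision = precision(true_pos=5_000_000, false_pos=300)

# De novo mutations: the figures quoted in the text.
de_novo_precision = precision(true_pos=50, false_pos=200)

print(f"Inherited calls that are real: {inherited_precision:.3%}")
print(f"De novo calls that are real:   {de_novo_precision:.1%}")
```

The variant caller is the same in both cases. What changes is how rare the target is, which is exactly the situation NIPT is in when it moves from trisomy 21 to microdeletions.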
Positive results from NIPT also get follow-up. In the case of the woman introduced at the start of the NYT article, an amniocentesis later revealed that the fetus did not have Prader-Willi syndrome. And this, perhaps, is the biggest thing missed by the reporters: positive NIPT results are the beginning of a process. Many of those frightening positive results are later refuted by direct molecular testing or imaging studies. This does not mean we should stop screening altogether or demand that the screen yield perfect results. If it did, we wouldn’t do PSA screens or mammograms.
It’s like my airport analogy: most people who set off a metal detector are not carrying a weapon. That’s why TSA agents don’t open fire at the sound of a buzzer. When I set one off — which seems to be one of my talents in life — I empty my pockets and try again. Yet I don’t want them to stop checking people with metal detectors.
Emotional Trauma of False Positives
This is not to discount or ignore the emotional trauma that a positive screening result brings. These are very real consequences for the families affected. If you’re reading this, you’re probably close to someone who got a scary-sounding result from a medical test (as I am). Even if unconfirmed, a possible diagnosis is usually terrifying.
Medical professionals take this risk quite seriously. It’s one of the major topics considered whenever population screening is discussed, one of the “costs” in a cost-benefit analysis. Yet we still do a lot of population screening because the benefits of early detection for many conditions outweigh the potential consequences.
Summary and Outlook for NIPT
I read the New York Times regularly, and I appreciate that they’re one outlet that produces in-depth articles, often about scientific/technical topics like genomics, which are backed by real reporting. I also acknowledge a lesson given to me by my AP English teacher in high school: good writing sometimes needs to take a stance on something. Yet I think this particular article has far too much of a negative slant. It does not, for example, comment on the fact that NIPT does detect true cases of genetic disorders, some of which are extremely rare. It also suggests that the only reason NIPT providers expanded their tests was to out-compete one another and make money, as if they’re just another flavor of Silicon Valley biotech looking to make a profit.
The timing of this article is hardly accidental. The verdict of the Theranos trial (Elizabeth Holmes) has a great many people — especially the wealthy investors who lost money — feeling quite uncharitable about technology firms that over-promise on medical testing. This NYT article is written to draw on the parallels — the promises made, the misleading marketing, the billions of dollars to be made. Yet there’s at least one key distinction: NIPT actually works.
Another distinction, and a nuance probably missed by the reporters, is that the customers of the tests are not really the pregnant mothers, but the clinicians who care for them. NIPT is not a direct-to-consumer test like Ancestry or 23andMe. One reason that NIPT and other panel tests continue to expand is that medical professionals want them to. Yes, many of the conditions being tested for are individually quite rare, but collectively they are not. Furthermore, many of these conditions are actionable. Let me take one example that the reporters enjoyed highlighting in their colorful circle plot graphics.
1p36 deletion syndrome, also called monosomy 1p36, is the most commonly observed terminal deletion in the human population, with an estimated prevalence of 1 in 5,000. This syndromic condition comprises many common clinical features including growth deficiency, brain malformations, seizures, craniofacial dysmorphism, congenital heart defects, and hearing/vision problems. Almost all patients have congenital hypotonia (muscle weakness), which is associated with feeding difficulties and developmental delays. Intellectual disability is also common, but variable in severity.
Most cases of 1p36 deletion are sporadic, meaning there’s no family history. Most patients survive well into adult life, but the severity of the disease and its ultimate effects vary widely. According to the Orphanet page for this condition:
Management should be multi-disciplinary and include a regular follow-up. Early diagnosis and access to personalized rehabilitation therapies focusing on motor development, cognition, communication, and social skills are highly recommended.
This is a severe disease with lifelong medical issues, but many of them can be managed. Congenital heart defects may require surgery. Seizures can be treated with standard anti-epileptic medications. Infantile spasms are responsive to corticotrophin. Feeding and growth should be monitored, especially early in life.
There were 3.6 million babies born last year in the US alone; based on the prevalence estimate, about 720 of them have 1p36 deletion. Yes, based on the false positive rate, another 4,500 could receive a false positive result from NIPT, but as with any positive screen, follow-up diagnostic testing can refute those results before birth. And there are still 720 babies whose condition would be correctly flagged. Given all of the potential benefits of early/multidisciplinary intervention, it seems like a non-invasive screening test is still a good idea.
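For anyone who wants to check the arithmetic, here is a short Python sketch of that back-of-the-envelope estimate. It uses only the figures quoted above (3.6 million births, a prevalence of 1 in 5,000, and 4,500 false positives) and assumes, for simplicity, that every pregnancy is screened and that every affected pregnancy is flagged. Neither assumption holds exactly in practice.

```python
births = 3_600_000             # US births last year, as quoted above
prevalence_denominator = 5_000 # 1p36 deletion: estimated 1 in 5,000
false_positives = 4_500        # false positive count, as quoted above

# Expected affected newborns, assuming the prevalence estimate applies uniformly.
expected_affected = births // prevalence_denominator

# Assumes every affected pregnancy screens positive (perfect sensitivity).
total_positives = expected_affected + false_positives
implied_ppv = expected_affected / total_positives
positives_per_true_case = total_positives / expected_affected

print(f"Expected affected newborns: {expected_affected}")
print(f"Total positive screens: {total_positives}")
print(f"Implied positive predictive value: {implied_ppv:.1%}")
print(f"Positive screens per true case: {positives_per_true_case:.2f}")
```

Put another way, under these assumptions roughly seven families would need follow-up testing for every child who is correctly identified early. Whether that trade is worth it is the real question, and it is the same one we already answer "yes" to for many other population screens.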