If a woman (or non-female-identifying person with a uterus and visions of starting a family) is struggling to conceive and decides to improve their reproductive odds at an IVF clinic, they’ll likely interact with a doctor, a nurse, and a receptionist. They will probably never meet the army of trained embryologists working behind closed lab doors to collect eggs, fertilize them, and develop the embryos bound for implantation.
One of embryologists’ more time-consuming jobs is grading embryos—looking at their morphological features under a microscope and assigning a quality score. Round, even numbers of cells are good. Fractured and fragmented cells, bad. They’ll use that information to decide which embryos to implant first.
Grading is more gut than science and not particularly accurate. Newer methods provide more information, like preimplantation genetic screening, which involves pulling off a cell to extract its DNA and test for abnormalities. But that tacks on additional costs to an already expensive IVF cycle and requires freezing the embryos until the test results come back. Manual embryo grading may be a crude tool, but it’s noninvasive and easy for most fertility clinics to carry out. Now, scientists say, an algorithm has learned to do all that time-intensive embryo ogling even better than a human.
In new research published today in NPJ Digital Medicine, scientists at Cornell University trained an off-the-shelf Google deep learning algorithm to identify IVF embryos as either good, fair, or poor, based on the likelihood each would successfully implant. This type of AI—the same neural network that identifies faces, animals, and objects in pictures uploaded to Google’s online services—has proven adept in medical settings. It has learned to diagnose diabetic blindness and identify the genetic mutations fueling cancerous tumor growth. IVF clinics could be where it’s headed next.
“All evaluation of the embryo as it’s done today is subjective,” says Nikica Zaninovic, director of the embryology lab at Weill Cornell Medicine, where the research was conducted. In 2011, the lab installed a time-lapse imaging system inside its incubators, so its technicians could watch (and record) the embryos developing in real time. This gave them something many fertility clinics in the US do not have—videos of more than 10,000 fully anonymized embryos that could each be freeze-framed and fed into a neural network. About two years ago, Zaninovic began Googling to find an AI expert to collaborate with. He found one just across campus in Olivier Elemento, director of Weill Cornell’s Englander Institute for Precision Medicine.
For years, Elemento had been collecting all kinds of medical imaging data—MRIs, mammograms, stained slides of tumor tissue—from any colleague who would give it to him, to develop automated systems to help radiologists and pathologists do their jobs better. He’d never thought to try it with IVF but could immediately see the potential. There’s a lot going on in an embryo that’s invisible to the human eye but might not be to a computer. “It was an opportunity to automate a process that is time-consuming and prone to errors,” he says. “Which is something that’s not really been done before with human embryos.”
To judge how their neural net, nicknamed STORK, stacked up against its human counterparts, they recruited five embryologists from clinics on three continents to grade 394 embryos based on images taken from different labs. The five embryologists reached the same conclusion on only 89 embryos, less than a quarter of the total. So the researchers instituted a majority voting procedure—three out of five embryologists needed to agree to classify an embryo as good, fair, or poor. When STORK looked at the same images, it predicted the embryologist majority voting decision with 95.7 percent accuracy. The most consistent volunteer matched results only 70 percent of the time; the least, 25 percent.
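For readers who want to see the mechanics, the majority-vote comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names, the toy votes, and the choice to set aside embryos where no grade gets three votes are all assumptions made here for the example.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_grade(votes: Sequence[str], min_votes: int = 3) -> Optional[str]:
    """Return the grade picked by at least `min_votes` graders, else None.

    With five graders and three grades, a 2-2-1 split has no majority.
    Returning None for that case is an assumption of this sketch.
    """
    grade, count = Counter(votes).most_common(1)[0]
    return grade if count >= min_votes else None


def is_unanimous(votes: Sequence[str]) -> bool:
    """True when every grader assigned the same grade."""
    return len(set(votes)) == 1


def agreement_rate(predictions: Sequence[str],
                   consensus: Sequence[Optional[str]]) -> float:
    """Share of embryos with a consensus grade where the prediction matches it."""
    scored = [(p, c) for p, c in zip(predictions, consensus) if c is not None]
    if not scored:
        return 0.0
    return sum(p == c for p, c in scored) / len(scored)


if __name__ == "__main__":
    # Hypothetical votes from five graders on four embryos (made-up data).
    panel = [
        ["good", "good", "good", "good", "good"],  # unanimous
        ["good", "good", "fair", "good", "poor"],  # majority: good
        ["fair", "poor", "poor", "fair", "poor"],  # majority: poor
        ["good", "good", "fair", "fair", "poor"],  # no majority
    ]
    consensus = [majority_grade(v) for v in panel]
    # Hypothetical model output for the same four embryos.
    model = ["good", "good", "fair", "good"]

    print("unanimous embryos:", sum(is_unanimous(v) for v in panel))
    print("consensus grades:", consensus)
    print("model agreement:", agreement_rate(model, consensus))
```

The same arithmetic underlies the figures in the study: 89 unanimous calls out of 394 embryos works out to about 22.6 percent, which is why the researchers fell back on a three-of-five vote.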
For now, STORK is just a tool embryologists can upload images to and play around with on a secure website hosted by Weill Cornell. It won’t be ready for the clinic until it can pass rigorous testing that follows implanted embryos over time, to see how well the algorithm fares in real life. Elemento says the group is still finalizing the design for a trial that would do that by pitting embryologists against the AI in a small, randomized cohort. Most important is understanding if STORK actually improves outcomes—not just implantation rates but successful, full-term pregnancies. On that score, at least some embryologists are skeptical.
“All this algorithm can do is change the order of which embryos we transfer,” says Eric Forman, medical and lab director at Columbia University Fertility Center. “It needs more evidence to say it helps women get pregnant quicker and safer.” On its own, he worries that STORK might make only a small contribution to improving IVF’s success rate, while possibly inserting its own biases.
Megan Molteni covers biotechnology, medicine, and genetic privacy for WIRED.
In addition to embryo grading, the Columbia clinic uses preimplantation genetic screening to improve patients’ odds of pregnancy. While not routine, it is offered to everyone. Forman says about 70 percent of the clinic’s IVF cycles include the blastocyst biopsy procedure, which can add a few thousand dollars to a patient’s tab. That’s why he’s most intrigued by what Elemento’s team is cooking up next. They’re training a new set of neural networks to see if they can detect chromosomal abnormalities, like the one that causes Down syndrome. With an embryo developing under a camera’s watchful gaze, Elemento’s algorithm would monitor the feed for telltale signs of trouble. “We think the patterns of cell division we can capture with these movies could potentially carry information about these defects, which are hidden in just the snapshots,” says Elemento. They’re also looking into using the technique to predict miscarriages.
There’s plenty of room to improve the performance of IVF, and these algorithmic upgrades could make a dent—in the right circumstances. “If it could provide accurate predictions in real time with minimal risk for harm and no additional cost, then I could see the potential to implement AI like this for embryo selection,” says Forman. But there would be barriers to its adoption. Most IVF clinics in the US don’t have one of these fancy time-lapse recording systems because they’re so expensive. And there are a lot of other potential ways to improve embryo viability that could be more affordable—like tailoring hormone treatments and culturing techniques to the different kinds of infertility that women experience. In the end, though, the number one problem IVF clinics contend with is that sometimes there just aren’t enough high-quality eggs, no matter how many cycles a patient goes through. And no AI, no matter how smart, can do anything about that.