Deep Learning Could End Testing New Drugs On Animals

A new technique may help researchers find new uses for drugs, much faster

Illustration: Diana Quach
May 26, 2016 at 2:58 PM ET

Discovering a new drug is no easy task. Once scientists have isolated or synthesized a particular compound, they then have to figure out what it’s good for. To make sure it works, they conduct experiments in petri dishes, then in mice, then in larger mammals, and eventually in humans—a (pricey) gauntlet that just 6 percent of drugs pass through.

Big data analytics company In Silico Medicine wants to change all that. Using a deep learning technique that can analyze how compounds affect cells, the researchers say they could determine not only what new drugs could be used to treat but, eventually, also their doses and possible side effects on other parts of the body. The researchers published a proof-of-concept study today in the American Chemical Society journal Molecular Pharmaceutics.

Since 2012, when researchers from the University of Toronto published a foundational paper, deep learning (high-level modeling that uses layers of algorithms trained on enormous datasets) has made its way into a huge number of applications, from facial recognition software to self-driving cars and even the strategy game Go. But deep learning has mostly been applied to visual tasks, says Alex Zhavoronkov, the CEO of In Silico Medicine. And that was only possible because the web was chock full of millions of example images that could be used to train the algorithms.

The researchers behind this study gathered an enormous dataset to bring deep learning to biology: 3 million gene transcription profiles, which are synopses of how the genes in a cell change their expression when the cell is exposed to a particular chemical compound. To gather the data, much of which came from the Broad Institute in Boston, researchers exposed different types of cancerous cells to varying doses of pharmaceutical compounds for different lengths of time. Then, they sequenced the genes of the cells to see how their expression had changed. A drug would be considered effective if it inhibited or accelerated pathways in those cells to counteract disease.

The researchers first trained their model on gene expression data collected after cells were exposed to known pharmaceutical compounds. This data enabled the model to predict whether or not a compound worked to counteract various types of diseases. Once they had trained the model, the researchers introduced data for several compounds that the system had never seen before, but with uses known to the researchers. They wanted to see if the algorithm would correctly predict whether or not each drug was effective against a dozen different types of disease, such as cancer, heart disease, or conditions of the central nervous system.
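The train-then-test procedure described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the data is synthetic, every name and number in it is hypothetical, and a tiny NumPy network stands in for the study's deep learning model, whose architecture the article does not describe. The one point it tries to show is that whole compounds are held out, so the model is scored on drugs it has never seen.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical and far smaller than the real dataset.
N_GENES = 50                # genes measured per profile
N_CLASSES = 3               # disease categories (the study used about a dozen)
N_COMPOUNDS = 45            # distinct compounds
PROFILES_PER_COMPOUND = 5   # e.g. different doses, durations, or cell types

# Synthetic stand-in data: each category has its own expression "signature",
# and every profile is that signature plus random noise.
signatures = rng.normal(0.0, 1.0, size=(N_CLASSES, N_GENES))
compound_class = np.arange(N_COMPOUNDS) % N_CLASSES
compound_ids = np.repeat(np.arange(N_COMPOUNDS), PROFILES_PER_COMPOUND)
labels = compound_class[compound_ids]
X = signatures[labels] + rng.normal(0.0, 1.0, size=(labels.size, N_GENES))

# Hold out entire compounds, not individual profiles.
test_compounds = rng.permutation(N_COMPOUNDS)[:12]
is_test = np.isin(compound_ids, test_compounds)


def train_mlp(X, y, n_classes, hidden=16, lr=0.02, epochs=500, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent."""
    init = np.random.default_rng(seed)
    n, d = X.shape
    W1 = init.normal(0.0, 0.1, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = init.normal(0.0, 0.1, size=(hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                      # one-hot targets
    for _ in range(epochs):
        H = np.maximum(0.0, X @ W1 + b1)          # ReLU hidden layer
        logits = H @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)         # softmax probabilities
        d_logits = (P - Y) / n                    # cross-entropy gradient
        d_H = d_logits @ W2.T
        d_H[H <= 0.0] = 0.0                       # ReLU gradient
        W2 -= lr * (H.T @ d_logits)
        b2 -= lr * d_logits.sum(axis=0)
        W1 -= lr * (X.T @ d_H)
        b1 -= lr * d_H.sum(axis=0)
    return W1, b1, W2, b2


def predict(params, X):
    """Return the most likely category for each profile."""
    W1, b1, W2, b2 = params
    return (np.maximum(0.0, X @ W1 + b1) @ W2 + b2).argmax(axis=1)


params = train_mlp(X[~is_test], labels[~is_test], N_CLASSES)
accuracy = float((predict(params, X[is_test]) == labels[is_test]).mean())
print(f"accuracy on held-out compounds: {accuracy:.0%}")
```

Because the synthetic categories are cleanly separated, this toy scores far better than a real model would on real cells; the number it prints says nothing about the study's results.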

The algorithm got the answer right 55 percent of the time. And while that sounds pretty low, it’s still just a proof of concept. In the months since the researchers submitted this study to peer review, they have changed the basic framework of the software, and as a result the algorithm has already gotten much more precise, Zhavoronkov says. Plus, he adds, that 55 percent is still way better than a human could have done.
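For a rough sense of scale, the 55 percent figure can be set against blind guessing. The short calculation below assumes, hypothetically, that each compound belongs to exactly one of 12 categories and that the categories are equally common; the article does not say how balanced they were, and a guesser that always picked the most common category could beat this baseline if they were not.

```python
N_CATEGORIES = 12          # "a dozen" disease types, per the article
REPORTED_ACCURACY = 0.55   # figure reported in the article

# Assumption: one category per compound, all categories equally likely.
chance_accuracy = 1 / N_CATEGORIES
ratio = REPORTED_ACCURACY / chance_accuracy

print(f"chance accuracy: {chance_accuracy:.1%}")
print(f"reported accuracy is about {ratio:.1f} times chance")
```

Under those assumptions, random guessing would be right roughly 8 percent of the time, which puts the reported accuracy at more than six times chance.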

As the model learns more about the effects of different compounds on gene expression, it can start to develop a shorthand, detecting biomarkers that indicate how well a drug would work for certain types of medical conditions. Eventually, the researchers expect, the model will contain enough information about biomarkers to predict how drugs affect whole tissues, illuminating their optimal dosing and side effects.

The current study doesn’t get into this, mostly because the system isn’t yet sophisticated enough. Zhavoronkov’s team at In Silico, along with a few other collaborating organizations, is working on larger-scale models, which they are calling “almost human,” to model the gene expression of cells from all across the body. The goal is to evaluate new pharmaceutical compounds without ever testing them on a living organism.

We’re still a few years away from that, Zhavoronkov says, and even then it will still be necessary to test new compounds in petri dishes before plugging the gene expression data in the system. He and his team plan to add much more data and many more samples to make the model more sophisticated and accurate. But with a system like that to vet a compound before it’s tested on organisms, clinical trials may someday become a mere formality.