In addition to being one of the most common forms of cancer in men, prostate cancer is responsible for the second-highest number of cancer-related deaths among men in the United States. In a recent study, researchers from the Dana-Farber Cancer Institute and the Broad Institute of MIT and Harvard developed a biologically informed deep learning model to improve prostate cancer discovery by determining the genetic and molecular factors that contribute to aggressive cancer phenotypes.
Named P-NET, the model differs from other predictive deep learning models in combining high interpretability with high accuracy. The term “deep learning model” refers to an artificial neural network made up of multiple layers of neurons through which a given input is passed before an output is produced. Classically, deep learning networks tend to act as “black box” systems: they can correctly predict outcomes from a given set of inputs (high accuracy) without revealing the inner workings that informed the prediction (low interpretability). Even so, in the context of solving complex biological problems, such as determining cancer cell morphology, traditional deep learning models have proved to be inefficient and ineffective. They require immense computing power and large data sets, and they often overfit to the specific data sets they are trained on, which reduces predictive accuracy on novel data.
P-NET resolves the issues of accuracy and computational cost by combining sparse deep learning models with specialized neural network architectures designed to mimic biological systems. Standard deep learning models are fully connected, meaning every neuron connects to every neuron in the subsequent layer; in sparse deep learning models, only a portion of those connections are made. P-NET decides which connections should exist based on real-world biological information. By limiting the number of connections to only those that follow encoded biological pathways and processes, P-NET is able to eliminate the need for massive training data sets, thereby reducing the risk of overfitting. Moreover, P-NET’s accuracy is further improved by the fact that it essentially takes a shortcut in learning: rather than having to create all of its connections from scratch, it starts out with a biologically inspired framework from which it can operate.
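To make the difference between a fully connected layer and a biologically constrained sparse layer concrete, here is a minimal sketch in plain Python. It is not the researchers' implementation: the gene names, pathway memberships, random weights, and helper functions (`build_mask`, `init_weights`, `masked_forward`) are all hypothetical, chosen only to show how a membership mask can switch off connections that have no biological basis.

```python
import math
import random

# Hypothetical toy data: 4 genes feeding 2 pathways. These names and
# memberships are invented for illustration, not taken from the study.
GENES = ["gene_a", "gene_b", "gene_c", "gene_d"]
PATHWAYS = {
    "pathway_1": {"gene_a", "gene_b"},
    "pathway_2": {"gene_b", "gene_c", "gene_d"},
}


def build_mask(genes, pathways):
    """Return mask[i][j] = 1 if gene i belongs to pathway j, else 0."""
    return [[1 if gene in members else 0 for members in pathways.values()]
            for gene in genes]


def init_weights(n_in, n_out, seed=0):
    """Random weights for a dense n_in x n_out layer (untrained)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_out)]
            for _ in range(n_in)]


def masked_forward(x, weights, mask):
    """Forward pass in which masked-out connections contribute nothing."""
    outputs = []
    for j in range(len(mask[0])):
        total = sum(x[i] * weights[i][j] * mask[i][j] for i in range(len(x)))
        outputs.append(math.tanh(total))
    return outputs


mask = build_mask(GENES, PATHWAYS)
weights = init_weights(len(GENES), len(PATHWAYS))

# A fully connected layer would use every gene-pathway pair; the masked
# layer keeps only the pairs listed in PATHWAYS.
dense_connections = len(GENES) * len(PATHWAYS)
sparse_connections = sum(sum(row) for row in mask)

# Hypothetical input: gene_a and gene_c altered, the others not.
activations = masked_forward([1.0, 0.0, 1.0, 0.0], weights, mask)
```

In this toy setup the mask keeps 5 of the 8 possible connections, and an alteration in `gene_a` cannot influence `pathway_2` because no connection between them exists. With fewer weights to fit, a model of this kind has less room to memorize its training data.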
Using biologically informed neural networks also allows for a level of interpretability not present in other deep learning models. P-NET can assign importance to the biological pathways within this structure as it learns, which provides insight into the factors that contribute to its predictions. Rather than simply feeding P-NET a genomic profile as input and letting it spit out a prediction of prostate cancer likelihood and phenotype, researchers are able to dissect the neural network to implicate pathways and reactions that could contribute to prostate cancer as well.
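One simple way to picture how importance might be read off such a network is to score each pathway node by the size of its contribution to the final prediction. The sketch below is an assumption-laden stand-in, not the attribution method used in the study: the pathway names, activation values, output weights, and the helpers `pathway_importance` and `rank_pathways` are hypothetical.

```python
# Hypothetical values for a toy network: one activation per pathway node
# for a single patient, and the weight linking each node to the output.
pathway_activations = {"pathway_1": 0.9, "pathway_2": -0.2, "pathway_3": 0.4}
output_weights = {"pathway_1": 1.5, "pathway_2": 2.0, "pathway_3": -0.5}


def pathway_importance(activations, weights):
    """Score each pathway by the absolute size of its output contribution,
    normalised so that the scores sum to 1."""
    raw = {name: abs(activations[name] * weights[name])
           for name in activations}
    total = sum(raw.values())
    if total == 0:
        return {name: 0.0 for name in raw}
    return {name: value / total for name, value in raw.items()}


def rank_pathways(scores):
    """Return pathway names ordered from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


scores = pathway_importance(pathway_activations, output_weights)
ranking = rank_pathways(scores)
```

Because every node in a biologically informed network corresponds to a named gene or pathway, a ranking like this one maps directly onto biology that researchers can follow up on, which is not possible when hidden nodes have no assigned meaning.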
Though this has yet to be tested, the developers of P-NET are optimistic that the structure will be useful for predicting other cancers as well.
Elmarakeby, H.A., Hwang, J., Arafeh, R. et al. Biologically informed deep neural network for prostate cancer discovery. Nature (2021). https://doi.org/10.1038/s41586-021-03922-4