Abstract
Understanding the interactions between a ligand and its molecular target is crucial in guiding the optimization of molecules for any in-silico drug-design workflow. Multiple experimental and computational methods have been developed to better understand these intermolecular interactions. With the availability of a large number of structural datasets, there is a need for developing statistical frameworks that improve upon existing physics-based solutions. Here, we report a method based on geometric deep learning that is capable of predicting the binding conformations of ligands to protein targets. A technique to generate graphical representations of proteins was developed to exploit the topological and electrostatic properties of the binding region. The developed framework, based on graph neural networks, learns a statistical potential based on the distance likelihood, which is tailor-made for each ligand–target pair. This potential can be coupled with global optimization algorithms such as differential evolution to reproduce the experimental binding conformations of ligands. We show that the potential based on distance likelihood, described here, performs similarly to or better than well-established scoring functions for docking and screening tasks. Overall, this method represents an example of how artificial intelligence can be used to improve structure-based drug design.
Significance statement
Current machine learning-based solutions to model protein-ligand interactions lack the level of interpretability that physics-based methods usually provide. Here, a workflow to embed protein binding surfaces as graphs was developed to serve as a viable data structure to be processed by geometric deep learning. The developed architecture, based on mixture density models, was employed to accurately estimate the position and conformation of the small molecule within the binding region. The likelihood-based scoring function was compared against existing physics-based alternatives, and significant performance improvements in terms of docking power, screening power and reverse screening power were observed. Taken together, the developed framework provides a platform for utilising geometric deep-learning models for interpretable prediction of protein-ligand interactions at a residue level.
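To make the idea of a distance-likelihood potential coupled with differential evolution more concrete, the following is a minimal sketch and not the authors' implementation. It assumes Gaussian mixture components, treats the ligand as a rigid body (translation plus rotation only), and replaces the trained graph neural network with hand-made mixture parameters centred on the distances of a known toy pose. All names (`mixture_neg_log_likelihood`, `apply_pose`, `score`) and all numerical settings are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy.optimize import differential_evolution
from scipy.spatial.transform import Rotation


def mixture_neg_log_likelihood(dist, pi, mu, sigma):
    """Negative log-likelihood of distances under per-pair 1-D Gaussian mixtures.

    dist: (P,) distances; pi, mu, sigma: (P, K) weights, means, std devs.
    """
    d = dist[:, None]
    log_comp = (
        np.log(pi)
        - 0.5 * np.log(2.0 * np.pi * sigma**2)
        - 0.5 * ((d - mu) / sigma) ** 2
    )
    # Numerically stable log-sum-exp over the K mixture components.
    m = log_comp.max(axis=1, keepdims=True)
    log_p = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return -log_p.sum()


def pairwise_distances(ligand_xyz, protein_xyz):
    """Flattened distances between every ligand atom and every protein node."""
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    return np.linalg.norm(diff, axis=-1).ravel()


def apply_pose(pose, ligand_xyz):
    """pose = (tx, ty, tz, rx, ry, rz): translation plus a rotation vector."""
    centroid = ligand_xyz.mean(axis=0)
    rot = Rotation.from_rotvec(pose[3:])
    return rot.apply(ligand_xyz - centroid) + centroid + pose[:3]


# Toy system (assumption): random protein nodes and a small rigid ligand.
rng = np.random.default_rng(0)
protein_xyz = rng.uniform(-8.0, 8.0, size=(30, 3))
ligand_ref = rng.normal(0.0, 1.5, size=(6, 3))
true_pose = np.array([1.0, -2.0, 0.5, 0.3, -0.2, 0.4])
true_dist = pairwise_distances(apply_pose(true_pose, ligand_ref), protein_xyz)

# Stand-in for the network output (assumption): two-component mixtures
# per ligand-atom/protein-node pair, one centred on the "true" distance.
n_pairs, n_comp = true_dist.size, 2
pi = np.full((n_pairs, n_comp), 0.5)
mu = np.stack([true_dist, true_dist + 2.0], axis=1)
sigma = np.full((n_pairs, n_comp), 0.5)


def score(pose):
    """Potential for a candidate pose: lower means more likely distances."""
    d = pairwise_distances(apply_pose(pose, ligand_ref), protein_xyz)
    return mixture_neg_log_likelihood(d, pi, mu, sigma)


# Global search over the six rigid-body degrees of freedom.
bounds = [(-5.0, 5.0)] * 3 + [(-np.pi, np.pi)] * 3
result = differential_evolution(score, bounds, seed=0, maxiter=200, tol=1e-8)
print("best score:", result.fun)
print("best pose:", result.x)
```

In this sketch the potential is specific to one ligand–target pair because the mixture parameters are defined per atom-node pair. A realistic docking setup would also need to optimize the ligand's internal torsions, which are omitted here for brevity.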
Publisher
Cold Spring Harbor Laboratory