Dock2D: Synthetic data for the molecular recognition problem

12/07/2022
by Siddharth Bhadra-Lobo, et al.

Predicting the physical interaction of proteins is a cornerstone problem in computational biology. New classes of learning-based algorithms are actively being developed, and are typically trained end-to-end on protein complex structures extracted from the Protein Data Bank. These training datasets tend to be large and difficult to use for prototyping and, unlike image or natural language datasets, they are not easily interpretable by non-experts. We present Dock2D-IP and Dock2D-IF, two "toy" datasets that can be used to select algorithms predicting protein-protein interactions, or any other type of molecular interaction. Using two-dimensional shapes as input, each example from Dock2D-IP ("interaction pose") describes the interaction pose of two shapes known to interact, and each example from Dock2D-IF ("interaction fact") describes whether or not two shapes form a stable complex. We propose a number of baseline solutions to the problem and show that the same underlying energy function can be learned either by solving the interaction pose task (formulated as an energy-minimization "docking" problem) or the fact-of-interaction task (formulated as a binding free energy estimation problem).
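
To make the two formulations in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' method or the Dock2D data format: shapes are assumed to be sets of integer grid cells, poses are restricted to 90-degree rotations plus integer translations, and the hand-written `energy` function (overlap penalty, contact reward) stands in for the energy function that the paper learns. All names (`rotate`, `energy`, `dock`, `free_energy`, `interacts`, the threshold `f0`) are hypothetical.

```python
import math
from itertools import product


def rotate(shape, k):
    """Rotate a set of (x, y) cells by k * 90 degrees about the origin."""
    cells = set(shape)
    for _ in range(k % 4):
        cells = {(-y, x) for x, y in cells}
    return cells


def energy(receptor, ligand, pose):
    """Toy energy (an assumption): +10 per overlapping cell, -1 per edge contact."""
    k, dx, dy = pose
    moved = {(x + dx, y + dy) for x, y in rotate(ligand, k)}
    overlap = len(receptor & moved)
    contacts = sum(
        (x + ax, y + ay) in moved
        for x, y in receptor
        for ax, ay in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    return 10.0 * overlap - 1.0 * contacts


def poses(span):
    """Enumerate all (rotation, dx, dy) poses within +/- span cells."""
    return product(range(4), range(-span, span + 1), range(-span, span + 1))


def dock(receptor, ligand, span=4):
    """Interaction pose task: return the pose with the lowest energy."""
    return min(poses(span), key=lambda p: energy(receptor, ligand, p))


def free_energy(receptor, ligand, span=4):
    """Fact-of-interaction task: F = -log sum_poses exp(-E), computed stably."""
    energies = [energy(receptor, ligand, p) for p in poses(span)]
    e_min = min(energies)
    return e_min - math.log(sum(math.exp(-(e - e_min)) for e in energies))


def interacts(receptor, ligand, f0, span=4):
    """Predict a stable complex when F falls below a threshold f0 (hypothetical)."""
    return free_energy(receptor, ligand, span) < f0


# A U-shaped receptor with a one-cell pocket, and a one-cell ligand.
receptor = {(0, 0), (1, 0), (2, 0), (0, 1), (2, 1)}
ligand = {(0, 0)}
best_pose = dock(receptor, ligand)
print(best_pose, energy(receptor, ligand, best_pose))
print(free_energy(receptor, ligand))
```

Both tasks read from the same `energy` function: docking takes its minimum over poses, while the interaction decision aggregates it over all poses into a single free-energy value. That shared dependence is what allows, in principle, one energy function to be fit from either kind of label.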


