DeepSymmetry : Using 3D convolutional networks for identification of tandem repeats and internal symmetries in protein structures
Motivation: Thanks to the recent advances in structural biology, nowadays three-dimensional structures of various proteins are solved on a routine basis. A large portion of these contain structural repetitions or internal symmetries. To understand the evolution mechanisms of these proteins and how structural repetitions affect the protein function, we need to be able to detect such proteins very robustly. As deep learning is particularly suited to deal with spatially organized data, we applied it to the detection of proteins with structural repetitions. Results: We present DeepSymmetry, a versatile method based on three-dimensional (3D) convolutional networks that detects structural repetitions in proteins and their density maps. Our method is designed to identify tandem repeat proteins, proteins with internal symmetries, symmetries in the raw density maps, their symmetry order, and also the corresponding symmetry axes. Detection of symmetry axes is based on learning six-dimensional Veronese mappings of 3D vectors, and the median angular error of axis determination is less than one degree. We demonstrate the capabilities of our method on benchmarks with tandem repeated proteins and also with symmetrical assemblies. For example, we have discovered over 10,000 putative tandem repeat proteins that are not currently present in the RepeatsDB database. Availability: The method is available at https://team.inria.fr/nano-d/software/deepsymmetry. It consists of a C++ executable that transforms molecular structures into volumetric density maps, and a Python code based on the TensorFlow framework for applying the DeepSymmetry model to these maps.
READ FULL TEXT