SPECTRe: Substructure Processing, Enumeration, and Comparison Tool Resource: An efficient tool to encode all substructures of molecules represented in SMILES

11/05/2021
by Yasemin Yesiltepe, et al.

Functional groups and moieties are chemical descriptors of biomolecules that can be used to interpret their properties and functions, leading to the understanding of chemical or biological mechanisms. These chemical building blocks, or substructures, enable the identification of common molecular subgroups, the assessment of structural similarities and critical interactions among a set of biological molecules with known activities, and the design of novel compounds with similar chemical properties. Here, we introduce a Python-based tool, SPECTRe (Substructure Processing, Enumeration, and Comparison Tool Resource), designed to provide all substructures in a given molecular structure, regardless of molecule size. It enumerates and generates substructures efficiently, in a human-readable SMILES format, using classical graph traversal algorithms (breadth-first and depth-first search). We demonstrate the application of SPECTRe on a set of 10,375 molecules in the molecular weight range 27 to 350 Da (≤26 non-hydrogen atoms), spanning a wide array of structure-based chemical functionalities and chemical classes. We found that the substructure count, as a measure of molecular complexity, depends strongly on the number of unique atom and bond types present, the degree of branching, and the presence of rings. Substructure counts are found to be similar for molecules that belong to particular chemical classes and that are classified by the characteristic features of certain topologies. SPECTRe shows promise for many cheminformatics applications, such as virtual screening for drug discovery, property prediction, fingerprint-based molecular similarity searching, and data mining for identifying frequent substructures.
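To make the idea of substructure enumeration by graph traversal concrete, the following is a minimal sketch and not SPECTRe's actual implementation. It assumes a molecule is given as a plain adjacency dictionary of heavy atoms (a hypothetical input format), ignores atom and bond types, and does not produce SMILES. The function name `enumerate_connected_substructures` is illustrative only. It grows connected atom subsets breadth-first, one neighboring atom at a time.

```python
from collections import deque


def enumerate_connected_substructures(adjacency):
    """Return all non-empty connected atom subsets of a molecular graph.

    `adjacency` maps an atom index to a list of bonded atom indices.
    This is an illustrative assumption, not SPECTRe's input format.
    """
    # Start the breadth-first expansion from every single atom.
    queue = deque(frozenset([atom]) for atom in adjacency)
    seen = set(queue)
    while queue:
        subset = queue.popleft()
        # Grow the current substructure by one bonded neighbor at a time.
        for atom in subset:
            for neighbor in adjacency[atom]:
                if neighbor in subset:
                    continue
                grown = subset | {neighbor}
                if grown not in seen:
                    seen.add(grown)
                    queue.append(grown)
    return seen


# Carbon skeletons of three four-atom molecules (hydrogens omitted).
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}           # linear chain
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}        # branched
cyclobutane = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # ring

for name, graph in [("butane", butane),
                    ("isobutane", isobutane),
                    ("cyclobutane", cyclobutane)]:
    print(name, len(enumerate_connected_substructures(graph)))
# butane 10
# isobutane 11
# cyclobutane 13
```

Even in this simplified setting, the branched and cyclic skeletons yield more connected subsets than the linear chain with the same number of atoms, which is in line with the dependence on branching and rings described above. A real tool would also need to distinguish atom and bond types and merge equivalent substructures, for example through canonical SMILES.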


