Reconciling Inconsistent Molecular Structures from Biochemical Databases

08/24/2023
by Casper Asbjørn Eriksen, et al.

Information on the structure of molecules, retrieved via biochemical databases, plays a pivotal role in various disciplines, such as metabolomics, systems biology, and drug discovery. However, no such database can be complete, and the chemical structure for a given compound is not necessarily consistent between databases. This paper presents StructRecon, a novel tool for resolving unique and correct molecular structures from database identifiers. StructRecon traverses the cross-links between entries in different databases to construct what we call an identifier graph, which offers a more complete view of the total information available on a particular compound across all the databases. To reconcile discrepancies between databases, we first present an extensible model for chemical structure which supports multiple independent levels of detail, allowing standardisation of the structure to be applied iteratively. In some cases, our standardisation approach results in multiple structures for a given compound, in which case a random walk-based algorithm is used to select the most likely structure among the incompatible alternatives. We applied StructRecon to the EColiCore2 model, resolving a unique chemical structure for 85.11% of identifiers. StructRecon is open-source and modular, which enables the potential support for more databases in the future.
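
To make the two ideas in the abstract concrete, the following is a minimal sketch, not StructRecon's actual implementation. It assumes a toy set of cross-links between hypothetical database entries (`dbA:1`, `dbB:7`, and so on), builds an identifier graph by breadth-first traversal, and then scores candidate structures with a random walk with restart computed by power iteration. The function names, the restart probability, the structure labels `S1` and `S2`, and the rule of summing walk probability per structure are all assumptions made for illustration; the paper's own algorithm may differ in every one of these choices.

```python
from collections import deque

# Hypothetical cross-links between database entries (illustrative data only).
LINKS = {
    "dbA:1": ["dbB:7", "dbC:3"],
    "dbB:7": ["dbA:1", "dbC:3"],
    "dbC:3": ["dbA:1", "dbB:7", "dbD:9"],
    "dbD:9": ["dbC:3"],
}
# Hypothetical structure label reported by each entry (stand-ins for real structures).
STRUCTURES = {"dbA:1": "S1", "dbB:7": "S1", "dbC:3": "S1", "dbD:9": "S2"}


def identifier_graph(start, links):
    """Collect every entry reachable from `start` by following cross-links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    # Keep only edges whose endpoints were both reached.
    return {n: [m for m in links.get(n, []) if m in seen] for n in seen}


def walk_scores(graph, start, restart=0.15, iterations=100):
    """Random walk with restart at `start`, approximated by power iteration."""
    prob = {n: 0.0 for n in graph}
    prob[start] = 1.0
    for _ in range(iterations):
        updated = {n: 0.0 for n in graph}
        for node, p in prob.items():
            neighbours = graph[node]
            if not neighbours:
                # Dead end: send all probability back to the start entry.
                updated[start] += p
                continue
            updated[start] += restart * p
            share = (1.0 - restart) * p / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        prob = updated
    return prob


def pick_structure(start, links, structures):
    """Sum walk probability per structure and return the highest-scoring one."""
    graph = identifier_graph(start, links)
    totals = {}
    for node, p in walk_scores(graph, start).items():
        label = structures.get(node)
        if label is not None:
            totals[label] = totals.get(label, 0.0) + p
    return max(totals, key=totals.get), totals


best, totals = pick_structure("dbA:1", LINKS, STRUCTURES)
print(best, totals)
```

In this toy graph three entries agree on one structure and a single, weakly connected entry disagrees, so the walk spends most of its time on entries supporting the majority structure. The intent is only to show why link topology, and not a simple vote count, can inform the choice among incompatible candidates.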


