# A novel method for inference of chemical compounds with prescribed topological substructures based on integer programming

Analysis of chemical graphs is becoming a major research topic in computational molecular biology because of its potential applications to drug design. One of the major approaches in this area is inverse quantitative structure activity/property relationship (inverse QSAR/QSPR) analysis, which infers chemical structures from given chemical activities or properties. Recently, a novel framework has been proposed for inverse QSAR/QSPR that uses both artificial neural networks (ANN) and mixed integer linear programming (MILP). The method consists of a prediction phase and an inverse prediction phase. In the first phase, a feature vector f(G) of a chemical graph G is introduced, and a prediction function ψ_𝒩 for a chemical property π is constructed with an ANN 𝒩. In the second phase, given a target value y^* of the chemical property π, a feature vector x^* is inferred by solving an MILP formulated from the trained ANN 𝒩 so that ψ_𝒩(x^*) is equal to y^*; a set of chemical structures G^* such that f(G^*) = x^* is then enumerated by a graph enumeration algorithm. So far, the framework has been applied to chemical compounds with a rather abstract topological structure, such as acyclic or monocyclic graphs and graphs with a specified polymer topology with cycle index up to 2. In this paper, we propose a new flexible modeling method for the framework so that we can specify a topological substructure of graphs and a partial assignment of chemical elements and bond multiplicities to a target graph.
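The two phases can be illustrated with a minimal toy sketch in Python. Everything in it is an assumption made for illustration and is not taken from the paper: the three-component feature vector, the integer linear function standing in for the trained ANN 𝒩, the exhaustive search standing in for the MILP, and the filter over a fixed candidate list standing in for the graph enumeration algorithm.

```python
from itertools import product

# A toy chemical graph G: a tuple of atom labels plus bonds (i, j, multiplicity).
# Hydrogens are omitted. These three molecules are illustrative candidates only.
ETHANOL = (("C", "C", "O"), ((0, 1, 1), (1, 2, 1)))
DIMETHYL_ETHER = (("C", "O", "C"), ((0, 1, 1), (1, 2, 1)))
ACETALDEHYDE = (("C", "C", "O"), ((0, 1, 1), (1, 2, 2)))
CANDIDATES = [ETHANOL, DIMETHYL_ETHER, ACETALDEHYDE]


def f(graph):
    """Hypothetical feature vector f(G): (#C atoms, #O atoms, #double bonds)."""
    atoms, bonds = graph
    double_bonds = sum(1 for _, _, mult in bonds if mult == 2)
    return (atoms.count("C"), atoms.count("O"), double_bonds)


# Phase 1 stand-in: a fixed integer linear function plays the role of the
# prediction function psi_N. The weights are made up, not a trained ANN.
WEIGHTS = (3, 5, -2)
BIAS = 1


def psi(x):
    """Toy prediction function on a feature vector x."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def infer_feature_vectors(y_star, upper=3):
    """Phase 2a stand-in: find every x with psi(x) == y_star.

    The paper's framework solves an MILP here; this sketch simply searches
    all integer vectors with entries in 0..upper.
    """
    return [x for x in product(range(upper + 1), repeat=3) if psi(x) == y_star]


def enumerate_graphs(x_star, candidates):
    """Phase 2b stand-in: keep the candidate graphs G with f(G) == x_star."""
    return [g for g in candidates if f(g) == x_star]


if __name__ == "__main__":
    y_star = psi(f(ETHANOL))  # use ethanol's predicted value as the target
    for x_star in infer_feature_vectors(y_star):
        print(x_star, enumerate_graphs(x_star, CANDIDATES))
```

In this toy setting ethanol and dimethyl ether share the same feature vector, which shows why the second phase returns a set of structures G^* rather than a single graph.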
