SIMPLE: Sparse Interaction Model over Peaks of moLEcules for fast, interpretable metabolite identification from tandem mass spectra

06/12/2021
by Dai Hai Nguyen, et al.

Motivation: Recent success in metabolite identification from tandem mass spectra has been led by machine learning, which proceeds in two stages: mapping mass spectra to molecular fingerprint vectors, and then retrieving candidate molecules from a database. In the first stage, fingerprint prediction, spectrum peaks are the features, so taking their interactions into account is a reasonable way to identify unknown metabolites more accurately. Existing approaches to fingerprint prediction rely only on individual peaks in the spectra and do not explicitly consider peak interactions. In addition, the current cutting-edge method is based on kernels, which are computationally heavy and difficult to interpret.

Results: We propose two learning models that incorporate peak interactions into fingerprint prediction. First, we extend the state-of-the-art kernel learning method by developing kernels for peak interactions and combining them with kernels for peaks through multiple kernel learning (MKL). Second, we formulate a sparse interaction model for metabolite peaks, called SIMPLE, that is computationally light and interpretable for fingerprint prediction. The formulation of SIMPLE is convex, which guarantees a globally optimal solution, and we develop an alternating direction method of multipliers (ADMM) algorithm to solve it. Experiments on the MassBank dataset show that both models achieve prediction accuracy comparable to that of the current top-performing kernel method. Furthermore, SIMPLE clearly reveals the individual peaks and peak interactions that contribute to the performance of fingerprint prediction.
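To make the idea of a sparse interaction model over peaks concrete, the following is a minimal sketch, not the authors' implementation. It assumes a score of the form b + w·x + xᵀWx, where x is a binned spectrum, w weights individual peaks, and W weights pairs of peaks. The abstract does not state the regularizers, so the two proximal operators shown (L1 shrinkage for w, singular value shrinkage for W) are assumptions, chosen because they are common building blocks inside ADMM solvers for sparse and low-rank models. All function names are hypothetical.

```python
import numpy as np

def interaction_score(x, w, W, b=0.0):
    """Assumed model form: bias + linear peak term + pairwise peak-interaction term."""
    return b + w @ x + x @ W @ x

def soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_1: shrinks each entry toward zero.

    Entries with magnitude below tau become exactly zero, which is what
    makes the learned peak weights sparse (assumed regularizer).
    """
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm: shrinks singular values.

    Small singular values are zeroed, giving a low-rank interaction
    matrix (assumed regularizer).
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Toy spectrum with three bins; bins 0 and 2 interact.
x = np.array([1.0, 0.0, 2.0])
w = np.array([0.5, -1.0, 0.0])
W = np.zeros((3, 3))
W[0, 2] = W[2, 0] = 0.25
score = interaction_score(x, w, W)
```

In this form, the nonzero entries of w and W can be read off directly as the peaks and peak pairs that drive a prediction, which is the kind of interpretability the abstract attributes to SIMPLE.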


