Efficiently predicting high resolution mass spectra with graph neural networks

01/26/2023
by Michael Murphy, et al.

Identifying a small molecule from its mass spectrum is the primary open problem in computational metabolomics. This is typically cast as information retrieval: an unknown spectrum is matched against spectra predicted computationally from a large database of chemical structures. However, current approaches to spectrum prediction model the output space in ways that force a tradeoff between capturing high-resolution mass information and tractable learning. We resolve this tradeoff by casting spectrum prediction as a mapping from an input molecular graph to a probability distribution over molecular formulas. We discover that a large corpus of mass spectra can be closely approximated using a fixed vocabulary constituting only 2% of all observed formulas. This enables efficient spectrum prediction using an architecture similar to graph classification, GrAFF-MS, which achieves significantly lower prediction error and orders-of-magnitude faster runtime than state-of-the-art methods.
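
As a rough illustration of the output representation described above, the sketch below turns a vector of scores over a fixed formula vocabulary into a spectrum: a softmax gives a probability per formula, and each formula's exact mass places its peak. This is not the authors' implementation. The three-formula vocabulary, the logits, and the function names are hypothetical stand-ins (in the paper the scores would come from a graph neural network), and the masses ignore charge and the electron mass.

```python
import math

# Monoisotopic masses in daltons (standard values, rounded).
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def formula_mass(formula):
    """Exact mass of a formula given as a dict of element -> atom count."""
    return sum(ELEMENT_MASS[element] * count for element, count in formula.items())


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_spectrum(logits, vocabulary):
    """Map per-formula scores to a list of (mass, intensity) peaks sorted by mass."""
    probabilities = softmax(logits)
    peaks = [(formula_mass(f), p) for f, p in zip(vocabulary, probabilities)]
    return sorted(peaks)


# Hypothetical vocabulary: C2H5O, CH3, H2O.
VOCAB = [{"C": 2, "H": 5, "O": 1}, {"C": 1, "H": 3}, {"H": 2, "O": 1}]

# Hypothetical scores standing in for the output of a graph neural network.
logits = [2.0, 0.5, 1.0]

spectrum = predict_spectrum(logits, VOCAB)
for mass, intensity in spectrum:
    print(f"{mass:.5f}\t{intensity:.3f}")
```

Because every peak position is computed from a formula rather than from a fixed-width mass bin, the predicted masses keep full resolution while the learning problem stays a classification over a finite vocabulary.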


