Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers

by Hadi Abdine, et al.

The complexity of large biological systems has led some scientists to regard understanding them as an all but inconceivable mission. Challenges at several levels complicate this task, one of which is predicting a protein's function. In recent years, significant progress has been made in this field through the development of various machine learning approaches. However, most existing methods formulate the task as a multi-classification problem, i.e., assigning predefined labels to proteins. In this work, we propose a novel approach, Prot2Text, which predicts a protein's function in free-text form, moving beyond the conventional binary or categorical classifications. By combining Graph Neural Networks (GNNs) and Large Language Models (LLMs) in an encoder-decoder framework, our model effectively integrates diverse data types, including protein sequences, structures, and textual annotations. This multimodal approach allows for a holistic representation of proteins' functions, enabling the generation of detailed and accurate descriptions. To evaluate our model, we extracted a multimodal protein dataset from SwissProt and demonstrate empirically the effectiveness of Prot2Text. These results highlight the transformative impact of multimodal models, specifically the fusion of GNNs and LLMs, empowering researchers with powerful tools for more accurate prediction of proteins' functions. The code, the models, and a demo will be publicly released.


Hierarchical Protein Function Prediction with Tail-GNNs

Protein function prediction may be framed as predicting subgraphs (with ...

Neural Embeddings for Protein Graphs

Proteins perform much of the work in living organisms, and consequently ...

ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts

Current protein language models (PLMs) learn protein representations mai...

Graph Attentional Autoencoder for Anticancer Hyperfood Prediction

Recent research efforts have shown the possibility to discover anticance...

Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review

Cancer has relational information residing at varying scales, modalities...

Single-Cell Multimodal Prediction via Transformers

The recent development of multimodal single-cell technology has made the...
