Interactive Molecular Discovery with Natural Language

06/21/2023
by Zheni Zeng, et al.

Natural language is expected to be a key medium for human-machine interaction in the era of large language models. In biochemistry, a range of molecule-centered tasks (e.g., property prediction and molecule mining) are highly significant but have a high technical threshold. Bridging molecule expressions in natural language and chemical language can not only greatly improve the interpretability of these tasks and reduce their operational difficulty, but also fuse the chemical knowledge scattered across complementary materials for a deeper comprehension of molecules. Motivated by these benefits, we propose conversational molecular design, a novel task that uses natural language to describe and edit target molecules. To better accomplish this task, we design ChatMol, a knowledgeable and versatile generative pre-trained model enhanced by injecting experimental property information, molecular spatial knowledge, and the associations between natural and chemical languages. We evaluate several typical solutions, including large language models (e.g., ChatGPT), and the results demonstrate both the difficulty of conversational molecular design and the effectiveness of our knowledge enhancement method. Case observations and analysis provide directions for further exploration of natural-language interaction in molecular discovery.


