Recent advances in the Self-Referencing Embedding Strings (SELFIES) library

02/07/2023
by Alston Lo, et al.

String-based molecular representations play a crucial role in cheminformatics applications and, with the growing success of deep learning in chemistry, have been readily adopted into machine learning pipelines. However, traditional string-based representations such as SMILES are often prone to syntactic and semantic errors when produced by generative models. To address these problems, a novel representation, SELF-referencIng Embedded Strings (SELFIES), was proposed that is inherently 100% robust, alongside an accompanying open-source implementation. Since then, we have generalized SELFIES to support a wider range of molecules and semantic constraints, and streamlined its underlying grammar. We have implemented this updated representation in subsequent versions of selfies, where we have also made major advances with respect to design, efficiency, and supported features. Hence, we present the current status of selfies (version 2.1.1) in this manuscript.
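To make concrete what a syntactic error in a SMILES string can look like, the following is a minimal sketch using only the Python standard library. The helper `find_smiles_syntax_errors` is a hypothetical name introduced here for illustration; it is not part of the selfies library and is not a SMILES parser. It checks only two rules, balanced branch parentheses and paired ring-closure labels, and ignores everything else, including the semantic (valence) errors the abstract also mentions.

```python
from typing import List

DIGITS = "0123456789"


def find_smiles_syntax_errors(smiles: str) -> List[str]:
    """Return a list of basic syntax problems in a SMILES string.

    Illustrative only: checks branch parentheses and ring-closure
    labels, nothing else. Hypothetical helper, not a selfies API.
    """
    errors: List[str] = []
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring labels seen an odd number of times so far
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Skip bracket atoms such as [13CH4] or [NH4+];
            # digits inside brackets are not ring-closure labels.
            end = smiles.find("]", i)
            if end == -1:
                errors.append("unclosed bracket atom")
                break
            i = end + 1
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                errors.append("unmatched ')'")
            else:
                depth -= 1
        elif ch in DIGITS:
            # A single digit opens a ring on first sight, closes it on second.
            open_rings ^= {ch}
        elif ch == "%":
            # Two-digit ring labels are written as %nn.
            label = smiles[i + 1:i + 3]
            if len(label) == 2 and all(c in DIGITS for c in label):
                open_rings ^= {label}
                i += 3
                continue
            errors.append("malformed '%' ring label")
        i += 1
    if depth > 0:
        errors.append("unclosed '('")
    for label in sorted(open_rings):
        errors.append(f"unpaired ring-closure label {label}")
    return errors


if __name__ == "__main__":
    for candidate in ["c1ccccc1", "CC(C)O", "c1ccccc", "CC(C"]:
        print(candidate, find_smiles_syntax_errors(candidate))
```

A generative model that emits SMILES character by character can easily drop a closing parenthesis or ring label, producing strings like the last two above. Checks of this kind are what a SMILES pipeline needs after generation; the abstract's claim is that SELFIES strings avoid the problem by construction.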


