HydroNet: Benchmark Tasks for Preserving Intermolecular Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data

by Sutanay Choudhury, et al.

Intermolecular and long-range interactions are central to phenomena as diverse as gene regulation, topological states of quantum materials, electrolyte transport in batteries, and the universal solvation properties of water. We present a set of challenge problems for preserving intermolecular interactions and structural motifs in machine-learning approaches to chemical problems. The problems are built on a recently published dataset of 4.95 million water clusters, which are held together by hydrogen-bonding interactions that give rise to longer-range structural patterns. The dataset provides spatial coordinates as well as two types of graph representations, to accommodate a variety of machine-learning practices.
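To make the relationship between spatial coordinates and a graph representation concrete, here is a minimal sketch that links oxygen atoms lying within a distance cutoff of each other. The abstract does not say how the dataset's two graph types are constructed, so this is an illustrative assumption rather than the dataset's method: the function name `oxygen_graph`, the 3.5 angstrom cutoff, and the sample coordinates are all hypothetical.

```python
import math
from itertools import combinations


def oxygen_graph(coords, cutoff=3.5):
    """Build an undirected graph over oxygen positions.

    Two oxygens are linked when their separation is at most `cutoff`
    (angstroms). The cutoff value is an assumption for illustration,
    not a parameter taken from the dataset.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


# Hypothetical oxygen positions (angstroms) for a four-molecule cluster
# arranged on a square with 2.8 angstrom sides.
oxygens = [(0.0, 0.0, 0.0), (2.8, 0.0, 0.0), (2.8, 2.8, 0.0), (0.0, 2.8, 0.0)]
graph = oxygen_graph(oxygens)
```

With these sample positions, each oxygen is linked to its two neighbours along the square's sides but not to the one across the diagonal, so the graph is a four-membered ring, a simple example of a structural motif that a model would need to preserve.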


Code Repositories


HydroNet: Benchmark Tasks for Preserving Long-range Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data, at the 34th Conference on Neural Information Processing Systems (NeurIPS), Workshop on Machine Learning and the Physical Sciences [https://arxiv.org/abs/2012.00131]
