
MetaRF: Differentiable Random Forest for Reaction Yield Prediction with a Few Trails

by Kexin Chen, et al.
Zhejiang University
The Chinese University of Hong Kong

Artificial intelligence has transformed medicinal chemistry and produced many impressive applications, but these applications depend on large numbers of training samples with high-quality annotations, which seriously limits the wider use of data-driven methods. In this paper, we focus on the reaction yield prediction problem, where the goal is to help chemists select high-yield reactions in a new chemical space using only a few experimental trials. To address this challenge, we first put forth MetaRF, an attention-based differentiable random forest model designed specifically for few-shot yield prediction. The attention weights of the random forest are optimized automatically by a meta-learning framework and can be adapted quickly to predict the performance of new reagents when given a few additional samples. To improve few-shot learning performance, we further introduce a dimension-reduction based sampling method that determines which samples are most valuable to test experimentally and then learn from. Our methodology is evaluated on three different datasets and achieves satisfactory performance on few-shot prediction. On high-throughput experimentation (HTE) datasets, the average yield of the top 10 high-yield reactions selected by our methodology is relatively close to that obtained by ideal yield selection.
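
The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the per-tree yield predictions are already available, and it replaces the paper's meta-learned attention with a single softmax weight per tree that is adapted by plain gradient descent on a few support samples. All names (`softmax`, `forest_predict`, `adapt_logits`, `mse`) and all numbers are hypothetical and chosen for illustration.

```python
import math

def softmax(logits):
    """Turn unconstrained logits into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forest_predict(tree_preds, logits):
    """Attention-weighted average of per-tree predictions for one sample."""
    weights = softmax(logits)
    return sum(w * p for w, p in zip(weights, tree_preds))

def mse(logits, preds_per_sample, yields):
    """Mean squared error of the weighted forest on a set of samples."""
    errs = [(forest_predict(p, logits) - y) ** 2
            for p, y in zip(preds_per_sample, yields)]
    return sum(errs) / len(errs)

def adapt_logits(logits, support_preds, support_yields, lr=0.5, steps=200):
    """Adapt the tree weights to a few support samples by gradient descent.

    Assumption: one global logit per tree (not input-dependent attention).
    """
    logits = list(logits)
    n = len(support_yields)
    for _ in range(steps):
        weights = softmax(logits)
        grad = [0.0] * len(logits)
        for preds, y in zip(support_preds, support_yields):
            y_hat = sum(w * p for w, p in zip(weights, preds))
            err = y_hat - y
            # d(y_hat)/d(logit_k) = w_k * (p_k - y_hat) for a softmax average
            for k, (w, p) in enumerate(zip(weights, preds)):
                grad[k] += 2.0 * err * w * (p - y_hat) / n
        logits = [z - lr * g for z, g in zip(logits, grad)]
    return logits

# Hypothetical support set: 3 reactions with a new reagent, 3 trees.
# Each row holds the three trees' predicted yields (fractions in [0, 1]).
support_preds = [[0.80, 0.50, 0.20],
                 [0.30, 0.60, 0.10],
                 [0.55, 0.20, 0.90]]
support_yields = [0.80, 0.30, 0.55]  # made-up measured yields

initial = [0.0, 0.0, 0.0]            # uniform weights before adaptation
adapted = adapt_logits(initial, support_preds, support_yields)

mse_before = mse(initial, support_preds, support_yields)
mse_after = mse(adapted, support_preds, support_yields)
weights_after = softmax(adapted)
```

In this toy setup the first tree matches the made-up measurements exactly, so adaptation shifts weight toward it and the support-set error drops. In a meta-learning setting, the starting logits would themselves be learned across many such tasks so that a few steps suffice; that outer loop is omitted here.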
