Kyunghyun Cho

research

∙ 09/13/2023

Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs

Most interpretability research in NLP focuses on understanding the behav...

0 Angelica Chen, et al. ∙

research

∙ 09/04/2023

Blind Biological Sequence Denoising with Self-Supervised Set Learning

Biological sequence analysis relies on the ability to denoise the imprec...

0 Nathan Ng, et al. ∙

research

∙ 08/18/2023

Latent State Models of Training Dynamics

The impact of randomness on model training is poorly understood. How do ...

0 Michael Y. Hu, et al. ∙

research

∙ 08/18/2023

Active and Passive Causal Inference Learning

This paper serves as a starting point for machine learning researchers, ...

0 Daniel Jiwoong Im, et al. ∙

research

∙ 08/11/2023

Improving Joint Speech-Text Representations Without Alignment

The last year has seen astonishing progress in text-prompted image gener...

0 Cal Peyser, et al. ∙

research

∙ 07/13/2023

Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section

Recent advances in large language models have led to renewed interest in...

0 Hongyi Zheng, et al. ∙

research

∙ 06/23/2023

System-Level Natural Language Feedback

Natural language (NL) feedback contains rich information about the user ...

0 Weizhe Yuan, et al. ∙

research

∙ 06/23/2023

On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis

Magnetic Resonance Imaging (MRI) is considered the gold standard of medi...

0 Divyam Madaan, et al. ∙

research

∙ 06/01/2023

BOtied: Multi-objective Bayesian optimization with tied multivariate ranks

Many scientific and industrial applications require joint optimization o...

4 Ji-won Park, et al. ∙

research

∙ 05/31/2023

Protein Design with Guided Discrete Diffusion

A popular approach to protein design is to combine a generative model wi...

3 Nate Gruver, et al. ∙

research

∙ 05/23/2023

Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs

Large language models (LLMs) have achieved widespread success on a varie...

5 Angelica Chen, et al. ∙

research

∙ 05/11/2023

Towards Understanding and Improving GFlowNet Training

Generative flow networks (GFlowNets) are a family of algorithms that lea...

2 Max W. Shen, et al. ∙

research

∙ 04/19/2023

A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale

Unpaired text and audio injection have emerged as dominant methods for i...

3 Cal Peyser, et al. ∙

research

∙ 03/28/2023

Training Language Models with Language Feedback at Scale

Pretrained language models often generate outputs that are not in line w...

1 Jérémy Scheurer, et al. ∙

research

∙ 03/28/2023

Improving Code Generation by Training with Natural Language Feedback

The potential for pre-trained large language models (LLMs) to use natura...

1 Angelica Chen, et al. ∙

research

∙ 02/08/2023

Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy

Despite the recent success of stochastic gradient descent in deep learni...

0 Cheolhyoung Lee, et al. ∙

research

∙ 01/11/2023

Dual Learning for Large Vocabulary On-Device ASR

Dual learning is a paradigm for semi-supervised machine learning that se...

1 Cal Peyser, et al. ∙

research

∙ 12/20/2022

Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

Task-oriented dialogue (TOD) systems are mainly based on the slot-fillin...

2 Sang-Woo Lee, et al. ∙

research

∙ 11/20/2022

Joint Embedding Predictive Architectures Focus on Slow Features

Many common methods for learning a world model for pixel-based environme...

0 Vlad Sobal, et al. ∙

research

∙ 11/13/2022

Language Model Classifier Aligns Better with Physician Word Sensitivity than XGBoost on Readmission Prediction

Traditional evaluation metrics for classification in natural language pr...

8 Grace Yang, et al. ∙

research

∙ 10/19/2022

A Pareto-optimal compositional energy-based model for sampling and optimization of protein sequences

Deep generative models have emerged as a popular machine learning-based ...

8 Natasa Tagasovska, et al. ∙

research

∙ 10/11/2022

HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea

Historical records in Korea before the 20th century were primarily writt...

18 Haneul Yoo, et al. ∙

research

∙ 10/08/2022

PropertyDAG: Multi-objective Bayesian optimization of partially ordered, mixed-variable properties for biological sequence design

Bayesian optimization offers a sample-efficient framework for navigating...

11 Ji-won Park, et al. ∙

research

∙ 10/03/2022

A Non-monotonic Self-terminating Language Model

Recent large-scale neural autoregressive sequence models have shown impr...

12 Eugene Choi, et al. ∙

research

∙ 08/28/2022

Towards Disentangled Speech Representations

The careful construction of audio representations has become a dominant ...

4 Cal Peyser, et al. ∙

research

∙ 07/05/2022

Predicting Out-of-Domain Generalization with Local Manifold Smoothness

Understanding how machine learning models generalize to new environments...

0 Nathan Ng, et al. ∙

research

∙ 06/27/2022

Endowing Language Models with Multimodal Knowledge Graph Representations

We propose a method to make natural language understanding models more p...

12 Ningyuan Huang, et al. ∙

research

∙ 05/24/2022

Linear Connectivity Reveals Generalization Strategies

It is widely accepted in the mode connectivity literature that when two ...

3 Jeevesh Juneja, et al. ∙

research

∙ 05/20/2022

Translating Hanja historical documents to understandable Korean and English

The Annals of Joseon Dynasty (AJD) contain the daily records of the King...

8 Juhee Son, et al. ∙

research

∙ 05/09/2022

Multi-segment preserving sampling for deep manifold sampler

Deep generative modeling for biological sequences presents a unique chal...

5 Daniel Berenberg, et al. ∙

research

∙ 04/29/2022

Training Language Models with Natural Language Feedback

Pretrained language models often do not perform tasks in ways that are i...

7 Jérémy Scheurer, et al. ∙

research

∙ 04/28/2022

On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model

Many recent studies on large-scale language models have reported success...

2 Seongjin Shin, et al. ∙

research

∙ 04/14/2022

Separating the World and Ego Models for Self-Driving

Training self-driving systems to be robust to the long-tail of driving s...

0 Vlad Sobal, et al. ∙

research

∙ 02/10/2022

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

We hypothesize that due to the greedy nature of learning in multi-modal ...

0 Nan Wu, et al. ∙

research

∙ 02/08/2022

Generative multitask learning mitigates target-causing confounding

We propose a simple and scalable approach to causal representation learn...

8 Taro Makino, et al. ∙

research

∙ 02/08/2022

Causal Scene BERT: Improving object detection by searching for challenging groups of data

Modern computer vision applications rely on learning-based perception mo...

7 Cinjon Resnick, et al. ∙

research

∙ 12/28/2021

LINDA: Unsupervised Learning to Interpolate in Natural Language Processing

Despite the success of mixup in data augmentation, its applicability to ...

11 Yekyung Kim, et al. ∙

research

∙ 12/16/2021

Characterizing and addressing the issue of oversmoothing in neural autoregressive sequence modeling

Neural autoregressive sequence models smear the probability among many p...

0 Ilia Kulikov, et al. ∙

research

∙ 12/16/2021

Amortized Noisy Channel Neural Machine Translation

Noisy channel models have been especially effective in neural machine tr...

1 Richard Yuanzhe Pang, et al. ∙

research

∙ 11/14/2021

DEEP: DEnoising Entity Pre-training for Neural Machine Translation

It has been shown that machine translation models usually generate poor ...

5 Junjie Hu, et al. ∙

research

∙ 11/03/2021

AlphaD3M: Machine Learning Pipeline Synthesis

We introduce AlphaD3M, an automatic machine learning (AutoML) system bas...

18 Iddo Drori, et al. ∙

research

∙ 10/18/2021

Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement

Recent work in simultaneous machine translation is often trained with co...

6 Hyojung Han, et al. ∙

research

∙ 09/21/2021

Chemical-Reaction-Aware Molecule Representation Learning

Molecule representation learning (MRL) methods aim to embed molecules in...

16 Hongwei Wang, et al. ∙

research

∙ 09/16/2021

Stereo Video Reconstruction Without Explicit Depth Maps for Endoscopic Surgery

We introduce the task of stereo video reconstruction or, equivalently, 2...

29 Annika Brundyn, et al. ∙

research

∙ 09/06/2021

An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models

Prompt-based knowledge probing for 1-hop relations has been used to meas...

52 Tianxing He, et al. ∙

research

∙ 08/10/2021

Meta-repository of screening mammography classifiers

Artificial intelligence (AI) is transforming medicine and showing promis...

8 Benjamin Stadnick, et al. ∙

research

∙ 07/26/2021

AAVAE: Augmentation-Augmented Variational Autoencoders

Recent methods for self-supervised learning can be grouped into two para...

4 William Falcon, et al. ∙

research

∙ 06/10/2021

Mode recovery in neural autoregressive sequence modeling

Despite its wide use, recent studies have revealed unexpected and undesi...

5 Ilia Kulikov, et al. ∙

research

∙ 06/01/2021

Comparing Test Sets with Item Response Theory

Recent years have seen numerous NLP datasets introduced to evaluate the ...

7 Clara Vania, et al. ∙

research

∙ 05/24/2021

True Few-Shot Learning with Language Models

Pretrained language models (LMs) perform well on many tasks even when le...

12 Ethan Perez, et al. ∙

