The legality of training language models (LMs) on copyrighted or otherwi...
In today's machine learning (ML) models, any part of the training data c...
Large language models are typically trained densely: all parameters are ...
Changing how pre-trained models behave – e.g., improving their performan...
When fine-tuning large neural networks, it is common to use multiple nod...
We present M2D2, a fine-grained, massively multi-domain corpus for study...
We present Branch-Train-Merge (BTM), a communication-efficient algorithm...
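A minimal sketch of one way independently trained expert LMs could be merged by parameter averaging; the uniform weighting and the `state_dict`-style interface are illustrative assumptions, not the paper's implementation.

```python
import torch

def merge_expert_lms(expert_state_dicts, weights=None):
    """Average the parameters of several independently trained expert LMs.

    Uniform weights are an assumption here; the weighting scheme (and whether
    to average parameters or ensemble predictions) is left as a design choice.
    """
    if weights is None:
        weights = [1.0 / len(expert_state_dicts)] * len(expert_state_dicts)
    merged = {}
    for name, param in expert_state_dicts[0].items():
        if not torch.is_floating_point(param):
            merged[name] = param.clone()  # copy integer buffers unchanged
            continue
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, expert_state_dicts))
    return merged
```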
We introduce kNN-Prompt, a simple and effective technique to use k-neare...
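A minimal sketch of the k-nearest-neighbor idea referenced above, assuming a kNN-LM-style setup: retrieve the closest context vectors from a datastore and interpolate the resulting next-token distribution with the LM's own distribution. The function names, L2 distance, and interpolation weight are assumptions for illustration.

```python
import numpy as np

def knn_next_token_probs(query_vec, keys, values, vocab_size, k=8, temperature=1.0):
    """Build a next-token distribution from the k nearest datastore entries.

    keys: (N, d) context vectors from a frozen LM; values: (N,) next-token ids.
    """
    dists = np.linalg.norm(keys - query_vec, axis=1)   # distance to every stored context
    nearest = np.argsort(dists)[:k]                    # indices of the k closest contexts
    weights = np.exp(-dists[nearest] / temperature)    # closer neighbors get more mass
    weights /= weights.sum()
    probs = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        probs[values[idx]] += w                        # mass on each neighbor's next token
    return probs

def interpolate(lm_probs, knn_probs, lam=0.3):
    """Mix the parametric LM distribution with the retrieval-based one."""
    return (1 - lam) * lm_probs + lam * knn_probs
```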
Language models increasingly rely on massive web dumps for diverse text ...
When an NLP model is trained on text data from one time period and teste...
Research in NLP is often supported by experimental results, and improved...
We introduce a new domain expert mixture (DEMix) layer that enables cond...
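A minimal sketch of a domain-expert feedforward layer, assuming one expert per training domain that is selected by a domain label rather than a learned router; the layer sizes and interface are illustrative assumptions.

```python
import torch.nn as nn

class DomainExpertFFN(nn.Module):
    """Feedforward block with one expert per domain; the input's domain label
    picks which expert runs (no learned routing in this sketch)."""

    def __init__(self, d_model=512, d_hidden=2048, num_domains=8):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_domains)
        )

    def forward(self, hidden_states, domain_id):
        # hidden_states: (batch, seq_len, d_model); domain_id indexes the expert.
        return self.experts[domain_id](hidden_states)
```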
Human evaluations are typically considered the gold standard in natural ...
Language models (LMs) must be both safe and equitable to be responsibly ...
Pretrained neural language models (LMs) are prone to generating racist, ...
Language models pretrained on text from a wide variety of sources form t...
Research in natural language processing proceeds, in part, by demonstrat...
We introduce VAMPIRE, a lightweight pretraining framework for effective ...
Large-scale datasets for natural language inference are created by prese...