Alexander M. Rush

research

∙ 06/21/2023

OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

Large multimodal models trained on natural documents, which interleave i...

0 Hugo Laurençon, et al. ∙

research

∙ 05/25/2023

Scaling Data-Constrained Language Models

The current trend of scaling language models involves increasing both pa...

0 Niklas Muennighoff, et al. ∙

research

∙ 05/24/2023

Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations

Abductive reasoning aims to find plausible explanations for an event. Th...

0 Wenting Zhao, et al. ∙

research

∙ 05/23/2023

HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision

Explainable multi-hop question answering (QA) not only predicts answers ...

0 Wenting Zhao, et al. ∙

research

∙ 12/20/2022

Pretraining Without Attention

Transformers have been essential to pretraining success in NLP. Other ar...

0 Junxiong Wang, et al. ∙

research

∙ 10/25/2022

Teal: Learning-Accelerated Optimization of Traffic Engineering

In the last decade, global cloud wide-area networks (WANs) have grown 10...

0 Zhiying Xu, et al. ∙

research

∙ 10/24/2022

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition

Speech recognition applications cover a range of different audio and tex...

0 Sanchit Gandhi, et al. ∙

research

∙ 10/20/2022

Unsupervised Text Deidentification

Deidentification seeks to anonymize textual data prior to distribution. ...

0 John X. Morris, et al. ∙

research

∙ 10/16/2022

Model Criticism for Long-Form Text Generation

Language models have demonstrated the ability to generate highly fluent ...

0 Yuntian Deng, et al. ∙

research

∙ 10/11/2022

Markup-to-Image Diffusion Models with Scheduled Sampling

Building on recent advances in image generation, we present a fully data...

0 Yuntian Deng, et al. ∙

research

∙ 08/16/2022

Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models

State-of-the-art neural language models can now be used to solve ad-hoc ...

8 Hendrik Strobelt, et al. ∙

research

∙ 02/02/2022

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

PromptSource is a system for creating, sharing, and using natural langua...

17 Stephen H. Bach, et al. ∙

research

∙ 01/08/2022

Low-Rank Constraints for Fast Inference in Structured Models

Structured distributions, i.e. distributions over combinatorial spaces, ...

17 Justin T. Chiu, et al. ∙

research

∙ 10/19/2021

GenNI: Human-AI Collaboration for Data-Backed Text Generation

Table2Text systems generate textual output based on structured data util...

5 Hendrik Strobelt, et al. ∙

research

∙ 10/15/2021

Multitask Prompted Training Enables Zero-Shot Task Generalization

Large language models have recently been shown to attain reasonable zero...

10 Victor Sanh, et al. ∙

research

∙ 09/14/2021

Rationales for Sequential Predictions

Sequence models are a critical component of modern NLP systems, but thei...

14 Keyon Vafa, et al. ∙

research

∙ 09/10/2021

Block Pruning For Faster Transformers

Pre-training has improved model accuracy for both classification and gen...

1 François Lagunas, et al. ∙

research

∙ 09/07/2021

Datasets: A Community Library for Natural Language Processing

The scale, variety, and quantity of publicly-available NLP datasets has ...

6 Quentin Lhoest, et al. ∙

research

∙ 04/08/2021

Low-Complexity Probing via Finding Subnetworks

The dominant approach in probing neural networks for linguistic properti...

11 Steven Cao, et al. ∙

research

∙ 03/15/2021

How Many Data Points is a Prompt Worth?

When fine-tuning pretrained models for classification, researchers eithe...

0 Teven Le Scao, et al. ∙

research

∙ 02/25/2021

Named Tensor Notation

We propose a notation for tensors with named axes, which relieves the au...

0 David Chiang, et al. ∙

research

∙ 12/14/2020

Parameter-Efficient Transfer Learning with Diff Pruning

While task-specific finetuning of pretrained networks has led to signifi...

0 Demi Guo, et al. ∙

research

∙ 12/02/2020

Learning from others' mistakes: Avoiding dataset biases without modeling them

State-of-the-art natural language processing (NLP) models often learn to...

10 Victor Sanh, et al. ∙

research

∙ 11/29/2020

Latent Template Induction with Gumbel-CRFs

Learning to control the structure of sentences is a challenging problem ...

0 Yao Fu, et al. ∙

research

∙ 11/28/2020

EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference

Transformer-based language models such as BERT provide significant accur...

10 Thierry Tambe, et al. ∙

research

∙ 11/18/2020

Sequence-Level Mixed Sample Data Augmentation

Despite their empirical success, neural networks still have difficulty c...

0 Demi Guo, et al. ∙

research

∙ 11/09/2020

Adversarial Semantic Collisions

We study semantic collisions: texts that are semantically unrelated but ...

0 Congzheng Song, et al. ∙

research

∙ 11/09/2020

Scaling Hidden Markov Language Models

The hidden Markov model (HMM) is a fundamental tool for sequence modelin...

0 Justin T. Chiu, et al. ∙

research

∙ 10/24/2020

Pre-trained Summarization Distillation

Recent state-of-the-art approaches to summarization utilize large pre-tr...

0 Sam Shleifer, et al. ∙

research

∙ 07/10/2020

MiniConf – A Virtual Conference Framework

MiniConf is a framework for hosting virtual academic conferences motivat...

0 Alexander M. Rush, et al. ∙

research

∙ 06/01/2020

Cascaded Text Generation with Markov Transformers

The two dominant approaches to neural text generation are fully autoregr...

31 Yuntian Deng, et al. ∙

research

∙ 05/15/2020

Movement Pruning: Adaptive Sparsity by Fine-Tuning

Magnitude pruning is a widely used strategy for reducing model size in p...

0 Victor Sanh, et al. ∙

research

∙ 05/10/2020

Posterior Control of Blackbox Generation

Text generation often requires high-precision output that obeys task-spe...

0 Xiang Lisa Li, et al. ∙

research

∙ 05/04/2020

What is Learned in Visually Grounded Neural Syntax Acquisition

Visual features are a promising signal for learning bootstrap textual mo...

0 Noriyuki Kojima, et al. ∙

research

∙ 03/13/2020

Automating Botnet Detection with Graph Neural Networks

Botnets are now a major source for many network attacks, such as DDoS at...

0 Jiawei Zhou, et al. ∙

research

∙ 02/03/2020

Torch-Struct: Deep Structured Prediction Library

The literature on structured prediction for NLP describes a rich collect...

48 Alexander M. Rush, et al. ∙

research

∙ 11/13/2019

A Hierarchy of Graph Neural Networks Based on Learnable Local Features

Graph neural networks (GNNs) are a powerful tool to learn representation...

33 Michael Lingzhi Li, et al. ∙

research

∙ 09/03/2019

Neural Linguistic Steganography

Whereas traditional cryptography encrypts a secret message into an unint...

0 Zachary M. Ziegler, et al. ∙

research

∙ 09/02/2019

Commonsense Knowledge Mining from Pretrained Models

Inferring commonsense knowledge is a key challenge in natural language p...

0 Joshua Feldman, et al. ∙

research

∙ 08/19/2019

Encoder-Agnostic Adaptation for Conditional Language Generation

Large pretrained language models have changed the way researchers approa...

0 Zachary M. Ziegler, et al. ∙

research

∙ 07/31/2019

Simple Unsupervised Summarization by Contextual Matching

We propose an unsupervised method for sentence summarization using only ...

0 Jiawei Zhou, et al. ∙

research

∙ 07/24/2019

Visual Interaction with Deep Learning Models through Collaborative Semantic Inference

Automation of tasks can have critical consequences when humans lose agen...

14 Sebastian Gehrmann, et al. ∙

research

∙ 07/09/2019

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference

Popular Natural Language Inference (NLI) datasets have been shown to be ...

0 Yonatan Belinkov, et al. ∙

research

∙ 07/09/2019

Don't Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference

Natural Language Inference (NLI) datasets often contain hypothesis-only ...

0 Yonatan Belinkov, et al. ∙

research

∙ 06/24/2019

Compound Probabilistic Context-Free Grammars for Grammar Induction

We study a formalization of the grammar induction problem that models se...

0 Yoon Kim, et al. ∙

research

∙ 06/10/2019

GLTR: Statistical Detection and Visualization of Generated Text

The rapid improvement of language models has raised the specter of abuse...

0 Sebastian Gehrmann, et al. ∙

research

∙ 04/07/2019

Unsupervised Recurrent Neural Network Grammars

Recurrent neural network grammars (RNNG) are generative models of langua...

0 Yoon Kim, et al. ∙

research

∙ 01/29/2019

Latent Normalizing Flows for Discrete Sequences

Normalizing flows have been shown to be a powerful class of generative m...

0 Zachary M. Ziegler, et al. ∙

research

∙ 12/17/2018

A Tutorial on Deep Latent Variable Models of Natural Language

There has been much recent, exciting work on combining the complementary...

0 Yoon Kim, et al. ∙

research

∙ 10/10/2018

End-to-End Content and Plan Selection for Data-to-Text Generation

Learning to generate fluent natural language from structured data with n...

2 Sebastian Gehrmann, et al. ∙
