Xiang Ren

research

∙ 07/31/2023

Virtual Prompt Injection for Instruction-Tuned Large Language Models

We present Virtual Prompt Injection (VPI) for instruction-tuned Large La...

Jun Yan, et al.

research

∙ 07/20/2023

Instruction-following Evaluation through Verbalizer Manipulation

While instruction-tuned models have shown remarkable success in various ...

Shiyang Li, et al.

research

∙ 05/24/2023

How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench

We investigate the predictability of large language model (LLM) capabili...

Qinyuan Ye, et al.

research

∙ 05/24/2023

Estimating Large Language Model Capabilities without Labeled Test Data

Large Language Models (LLMs) have exhibited an impressive ability to per...

Harvey Yiyun Fu, et al.

research

∙ 05/24/2023

GRILL: Grounded Vision-language Pre-training via Aligning Text and Image Regions

Generalization to unseen tasks is an important ability for few-shot lear...

Woojeong Jin, et al.

research

∙ 05/03/2023

SCOTT: Self-Consistent Chain-of-Thought Distillation

Large language models (LMs) beyond a certain scale, demonstrate the emer...

Peifeng Wang, et al.

research

∙ 03/16/2023

Exploring Distributional Shifts in Large Language Models for Code Analysis

We systematically study the capacity of two large language models for co...

Shushan Arakelyan, et al.

research

∙ 12/19/2022

Dataless Knowledge Fusion by Merging Weights of Language Models

Fine-tuning pre-trained language models has become the prevalent paradig...

Xisen Jin, et al.

research

∙ 12/19/2022

KNIFE: Knowledge Distillation with Free-Text Rationales

Free-text rationales (FTRs) follow how humans communicate by explaining ...

Aaron Chan, et al.

research

∙ 12/19/2022

APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning

Logical reasoning of text is an important ability that requires understa...

Soumya Sanyal, et al.

research

∙ 11/28/2022

CoNAL: Anticipating Outliers with Large Language Models

In many task settings, text classification models are likely to encounte...

Albert Xu, et al.

research

∙ 11/03/2022

PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales

Neural language models (LMs) have achieved impressive results on various...

Peifeng Wang, et al.

research

∙ 10/30/2022

XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models

NLP models are susceptible to learning spurious biases (i.e., bugs) that...

Dong-Ho Lee, et al.

research

∙ 10/18/2022

MMGA: Multimodal Learning with Graph Alignment

Multimodal pre-training breaks down the modality barriers and allows the...

Xuan Yang, et al.

research

∙ 07/29/2022

Curriculum Learning for Data-Efficient Vision-Language Alignment

Aligning image and text encoders from scratch using contrastive learning...

Tejas Srinivasan, et al.

research

∙ 07/18/2022

Retweet-BERT: Political Leaning Detection Using Language Features and Information Diffusion on Social Networks

Estimating the political leanings of social media users is a challenging...

Julie Jiang, et al.

research

∙ 07/02/2022

FRAME: Evaluating Simulatability Metrics for Free-Text Rationales

Free-text rationales aim to explain neural language model (LM) behavior ...

Aaron Chan, et al.

research

∙ 06/14/2022

NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge

News article revision histories provide clues to narrative and factual e...

Alexander Spangher, et al.

research

∙ 05/25/2022

Eliciting Transferability in Multi-task Learning with Task-level Mixture-of-Experts

Recent work suggests that transformer models are capable of multi-task l...

Qinyuan Ye, et al.

research

∙ 05/25/2022

Textual Backdoor Attacks with Iterative Trigger Injection

The backdoor attack has become an emerging threat for Natural Language P...

Jun Yan, et al.

research

∙ 05/25/2022

RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning

Transformers have been shown to be able to perform deductive reasoning o...

Soumya Sanyal, et al.

research

∙ 05/25/2022

ER-TEST: Evaluating Explanation Regularization Methods for NLP Models

Neural language models' (NLMs') reasoning processes are notoriously hard...

Brihi Joshi, et al.

research

∙ 05/25/2022

Machine Translation Robustness to Natural Asemantic Variation

We introduce and formalize an under-studied linguistic phenomenon we cal...

Jacob Bremerman, et al.

research

∙ 05/23/2022

Cross-lingual Lifelong Learning

The longstanding goal of multi-lingual learning has been to develop a un...

Meryem M'hamdi, et al.

research

∙ 05/21/2022

NS3: Neuro-Symbolic Semantic Code Search

Semantic code search is the task of retrieving a code snippet given a te...

Shushan Arakelyan, et al.

research

∙ 03/19/2022

FaiRR: Faithful and Robust Deductive Reasoning over Natural Language

Transformers have been shown to be able to perform deductive reasoning o...

Soumya Sanyal, et al.

research

∙ 03/14/2022

Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-modal Knowledge Transfer

Pre-trained language models are still far from human performance in task...

Woojeong Jin, et al.

research

∙ 12/16/2021

UniREx: A Unified Learning Framework for Language Model Rationale Extraction

An extractive rationale explains a language model's (LM's) prediction on...

Aaron Chan, et al.

research

∙ 12/12/2021

Contextualized Scene Imagination for Generative Commonsense Reasoning

Humans use natural language to compose common concepts from their enviro...

Peifeng Wang, et al.

research

∙ 10/16/2021

Sparse Distillation: Speeding Up Text Classification by Using Bigger Models

Distilling state-of-the-art transformer models into lightweight student ...

Qinyuan Ye, et al.

research

∙ 10/16/2021

Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora

Pretrained language models (PTLMs) are typically learned over a large, s...

Xisen Jin, et al.

research

∙ 10/16/2021

A Good Prompt Is Worth Millions of Parameters? Low-resource Prompt-based Learning for Vision-Language Models

Large pretrained vision-language (VL) models can learn a new task with a...

Woojeong Jin, et al.

research

∙ 10/16/2021

Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER

Recent advances in prompt-based learning have shown impressive results o...

Dong-Ho Lee, et al.

research

∙ 10/08/2021

KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering

Current Open-Domain Question Answering (ODQA) model paradigm often conta...

Donghan Yu, et al.

research

∙ 08/31/2021

Discretized Integrated Gradients for Explaining Language Models

As a prominent attribution-based explanation algorithm, Integrated Gradi...

Soumya Sanyal, et al.

research

∙ 08/03/2021

Improving Counterfactual Generation for Fair Hate Speech Detection

Bias mitigation approaches reduce models' dependence on sensitive featur...

Aida Mostafazadeh Davani, et al.

research

∙ 06/22/2021

Do Language Models Perform Generalizable Commonsense Inference?

Inspired by evidence that pretrained language models (LMs) encode common...

Peifeng Wang, et al.

research

∙ 06/04/2021

AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding

Automatic extraction of product attribute values is an important enablin...

Jun Yan, et al.

research

∙ 04/20/2021

X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering

Multilingual models, such as M-BERT and XLM-R, have gained increasing po...

Meryem M'hamdi, et al.

research

∙ 04/18/2021

On the Influence of Masking Policies in Intermediate Pre-training

Current NLP models are predominantly trained through a pretrain-then-fin...

Qinyuan Ye, et al.

research

∙ 04/18/2021

Lifelong Learning of Few-shot Learners across NLP Tasks

Recent advances in large pre-trained language models have greatly improv...

Xisen Jin, et al.

research

∙ 04/18/2021

SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning

Augmenting pre-trained language models with knowledge graphs (KGs) has a...

Aaron Chan, et al.

research

∙ 04/18/2021

On the Strengths of Cross-Attention in Pretrained Transformers for Machine Translation

We study the power of cross-attention in the Transformer architecture wi...

Mozhdeh Gheini, et al.

research

∙ 04/18/2021

Extract, Denoise, and Enforce: Evaluating and Predicting Lexical Constraints for Conditional Text Generation

Recently, pre-trained language models (PLMs) have dominated conditional ...

Yuning Mao, et al.

research

∙ 03/18/2021

Refining Neural Networks with Compositional Explanations

Neural networks are prone to learning spurious correlations from biased ...

Huihan Yao, et al.

research

∙ 01/06/2021

Modality-specific Distillation

Large neural networks are impractical to deploy on mobile devices due to...

Woojeong Jin, et al.

research

∙ 01/02/2021

Zero-shot Learning by Generating Task-specific Adapters

Pre-trained text-to-text transformers achieve impressive performance acr...

Qinyuan Ye, et al.

research

∙ 12/31/2020

Studying Strategically: Learning to Mask for Closed-book QA

Closed-book question-answering (QA) is a challenging task that requires ...

Qinyuan Ye, et al.

research

∙ 12/30/2020

DEER: A Data Efficient Language Model for Event Temporal Reasoning

Pretrained language models (LMs) such as BERT, RoBERTa, and ELECTRA are ...

Rujun Han, et al.

research

∙ 10/24/2020

Learning Contextualized Knowledge Structures for Commonsense Reasoning

Recently, neural-symbolic architectures have achieved success on commons...

Jun Yan, et al.
