DISCO: Distilling Phrasal Counterfactuals with Large Language Models

12/20/2022


Recent methods demonstrate that data augmentation using counterfactual knowledge can teach models the causal structure of a task, leading to robust and generalizable models. However, such counterfactual data is often limited in scale and diversity when crowdsourced, and is computationally expensive to extend to new perturbation types when generated using supervised methods. To address this, we introduce a new framework called DISCO for automatically generating high-quality counterfactual data at scale. DISCO engineers prompts to generate phrasal perturbations with a large general language model. Then, a task-specific teacher model filters the generations to distill high-quality counterfactual data. We show that learning with this counterfactual data yields a comparatively small student model that is 6% (absolute) more robust and generalizes 5% better across distributions than baselines on various challenging evaluations. This model is also 15% more sensitive in differentiating original and counterfactual examples on three evaluation sets written by human workers and via human-AI collaboration.
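The generate-then-filter idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `Counterfactual` record, the 0.5 threshold, and the toy generator and teacher are all hypothetical stand-ins for a prompted large language model and a task-specific teacher model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Counterfactual:
    original: str
    perturbed: str
    target_label: str


def distill_counterfactuals(
    sentence: str,
    span: str,
    target_label: str,
    generate: Callable[[str, str, str], List[str]],
    teacher: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Counterfactual]:
    """Replace `span` with generated phrases and keep those the teacher accepts."""
    kept = []
    for replacement in generate(sentence, span, target_label):
        perturbed = sentence.replace(span, replacement, 1)
        if perturbed == sentence:
            continue  # discard generations that leave the sentence unchanged
        # The teacher scores how strongly the perturbed text supports the target label.
        if teacher(perturbed, target_label) >= threshold:
            kept.append(Counterfactual(sentence, perturbed, target_label))
    return kept


# Toy stand-ins (hypothetical): a real setup would prompt an LLM and call a classifier.
def toy_generate(sentence: str, span: str, target_label: str) -> List[str]:
    return ["is sleeping", "is jogging", span]


def toy_teacher(text: str, label: str) -> float:
    return 0.9 if "sleeping" in text else 0.1


result = distill_counterfactuals(
    "The dog is running in the park.",
    "is running",
    "contradiction",
    toy_generate,
    toy_teacher,
)
for cf in result:
    print(cf.perturbed)
```

In this toy run, the unchanged candidate is discarded before scoring and the low-scoring candidate is rejected by the teacher, so only one perturbation survives. The surviving examples are what a smaller student model would then be trained on.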
