
-
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
The early phase of training has been shown to be important in two ways f...
-
Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing
We present BRIDGE, a powerful sequential architecture for modeling depen...
-
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
Intent detection is one of the core components of goal-oriented dialog s...
-
Online Structured Meta-learning
Learning quickly is of great importance for machine intelligence deploye...
-
Explaining and Improving Model Behavior with k Nearest Neighbor Representations
Interpretability techniques in NLP have mainly focused on understanding ...
-
Explaining Creative Artifacts
Human creativity is often described as the mental process of combining a...
-
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start
A standard way to address different NLP problems is by first constructin...
-
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
We present GraPPa, an effective pre-training approach for table semantic...
-
Composed Variational Natural Language Generation for Few-shot Intents
In this paper, we focus on generating training examples for few-shot int...
-
GeDi: Generative Discriminator Guided Sequence Generation
Class-conditional language models (CC-LMs) can be used to generate natur...
-
Central Yup'ik and Machine Translation of Low-Resource Polysynthetic Languages
Machine translation tools do not yet exist for the Yup'ik language, a po...
-
Photon: A Robust Cross-Domain Text-to-SQL System
Natural language interfaces to databases (NLIDB) democratize end user ac...
-
SummEval: Re-evaluating Summarization Evaluation
The scarcity of comprehensive up-to-date studies on evaluation metrics f...
-
DART: Open-Domain Structured Data Record to Text Generation
We introduce DART, a large dataset for open-domain structured data recor...
-
Theory-Inspired Path-Regularized Differential Network Architecture Search
Despite its high search efficiency, differential architecture search (DA...
-
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Transformer architectures have proven to learn useful representations fo...
-
Towards Understanding Hierarchical Learning: Benefits of Neural Representations
Deep neural networks can empirically perform efficient hierarchical lear...
-
A High-Quality Multilingual Dataset for Structured Documentation Translation
This paper presents a high-quality multilingual dataset for the document...
-
CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization
The COVID-19 global pandemic has resulted in international efforts to un...
-
WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos
Online action detection in untrimmed videos aims to identify an action a...
-
EMT: Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading
The goal of conversational machine reading is to answer user questions g...
-
Prototypical Contrastive Learning of Unsupervised Representations
This paper presents Prototypical Contrastive Learning (PCL), an unsuperv...
-
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
Training on only perfect Standard English corpora predisposes pre-traine...
-
A Simple Language Model for Task-Oriented Dialogue
Task-oriented dialogue is often decomposed into three tasks: understandi...
-
ESPRIT: Explaining Solutions to Physical Reasoning Tasks
Neural networks lack the ability to reason about qualitative physics and...
-
The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies
Tackling real-world socio-economic challenges requires designing and tes...
-
ToD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogues
The use of pre-trained language models has emerged as a promising direct...
-
Improving out-of-distribution generalization via multi-task self-supervised pretraining
Self-supervised feature representations have been shown to be useful for...
-
Towards Noise-resistant Object Detection with Noisy Annotations
Training deep object detectors requires significant amount of human-anno...
-
Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning
We introduce a parameterization method called Neural Bayes which allows ...
-
Tree-structured Attention with Hierarchical Accumulation
Incorporating hierarchical structures like constituency trees has been s...
-
Non-Autoregressive Dialog State Tracking
Recent efforts in Dialogue State Tracking (DST) for task-oriented dialog...
-
DivideMix: Learning with Noisy Labels as Semi-supervised Learning
Deep neural networks are known to be annotation-hungry. Numerous efforts...
-
Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width
We propose Taylorized training as an initiative towards better understan...
-
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Acquiring abilities in the absence of a task-oriented reward function is...
-
Limits of Detecting Text Generated by Large-Scale Language Models
Some consider large-scale language models that can generate long and coh...
-
Learning from Noisy Anchors for One-stage Object Detection
State-of-the-art object detectors rely on regressing and classifying an ...
-
Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
Answering questions that require multi-hop reasoning at web-scale necess...
-
Attentive Student Meets Multi-Task Teacher: Improved Knowledge Distillation for Pretrained Models
In this paper, we explore the knowledge distillation approach under the ...
-
ERASER: A Benchmark to Evaluate Rationalized NLP Models
State-of-the-art models in NLP are now predominantly based on deep neura...
-
Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards
While using shaped rewards can be beneficial when solving sparse reward ...
-
Sketch-Fill-A-R: A Persona-Grounded Chit-Chat Generation Framework
Human-like chit-chat conversation requires agents to generate responses ...
-
Evaluating the Factual Consistency of Abstractive Text Summarization
Currently used metrics for assessing summarization algorithms do not acc...
-
Global Capacity Measures for Deep ReLU Networks via Path Sampling
Classical results on the statistical complexity of linear models have co...
-
Find or Classify? Dual Strategy for Slot-Value Predictions on Multi-Domain Dialog State Tracking
Dialog State Tracking (DST) is a core component in task-oriented dialog ...
-
Entropy Penalty: Towards Generalization Beyond the IID Assumption
It has been shown that instead of learning actual object features, deep ...
-
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
We present CoSQL, a corpus for building cross-domain, general-purpose da...
-
CTRL: A Conditional Transformer Language Model for Controllable Generation
Large-scale language models show promising text generation capabilities,...
-
Pretrained AI Models: Performativity, Mobility, and Change
The paradigm of pretrained deep learning models has recently emerged in ...
-
Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression
Text compression has diverse applications such as Summarization, Reading...