Mohit Bansal

research

∙ 09/18/2023

Unified Coarse-to-Fine Alignment for Video-Text Retrieval

The canonical approach to video-text retrieval leverages a coarse-graine...

0 Ziyang Wang, et al. ∙

research

∙ 07/05/2023

Exploring Continual Learning for Code Generation Models

Large-scale code generation models such as Codex and CodeT5 have achieve...

0 Prateek Yadav, et al. ∙

research

∙ 07/04/2023

On Conditional and Compositional Language Model Differentiable Prompting

Prompts have been shown to be an effective method to adapt a frozen Pret...

0 Jonathan Pilault, et al. ∙

research

∙ 06/15/2023

Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind

Large Language Models (LLMs) perform complex reasoning by generating exp...

9 Swarnadeep Saha, et al. ∙

research

∙ 06/09/2023

Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects

Biological vision systems make adaptive use of context to recognize obje...

3 Zhuofan Ying, et al. ∙

research

∙ 06/02/2023

Resolving Interference When Merging Models

Transfer learning - i.e., further fine-tuning a pre-trained model on a d...

2 Prateek Yadav, et al. ∙

research

∙ 05/30/2023

PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation

Vision-and-Language Navigation (VLN) requires the agent to follow langua...

2 Jialu Li, et al. ∙

research

∙ 05/27/2023

Non-Sequential Graph Script Induction via Multimedia Grounding

Online resources such as WikiHow compile a wide range of scripts for per...

4 Yu Zhou, et al. ∙

research

∙ 05/26/2023

MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

Autoregressive language models are trained by minimizing the cross-entro...

1 Shiyue Zhang, et al. ∙

research

∙ 05/24/2023

Visual Programming for Text-to-Image Generation and Evaluation

As large language models have demonstrated impressive performance in man...

8 Jaemin Cho, et al. ∙

research

∙ 05/19/2023

Any-to-Any Generation via Composable Diffusion

We present Composable Diffusion (CoDi), a novel generative model capable...

3 Zineng Tang, et al. ∙

research

∙ 05/18/2023

Paxion: Patching Action Knowledge in Video-Language Foundation Models

Action knowledge involves the understanding of textual, visual, and temp...

3 Zhenhailong Wang, et al. ∙

research

∙ 05/11/2023

Self-Chained Image-Language Model for Video Localization and Question Answering

Recent studies have shown promising results on utilizing pre-trained ima...

3 Shoubin Yu, et al. ∙

research

∙ 05/08/2023

HistAlign: Improving Context Dependency in Language Generation by Aligning with History

Language models (LMs) can generate hallucinations and incoherent outputs...

3 David Wan, et al. ∙

research

∙ 04/28/2023

An Empirical Study of Multimodal Model Merging

Model merging (e.g., via interpolation or task arithmetic) fuses multipl...

3 Yi-Lin Sung, et al. ∙

research

∙ 04/21/2023

ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness

Multi-step reasoning ability is fundamental to many natural language tas...

4 Archiki Prasad, et al. ∙

research

∙ 04/13/2023

Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

Spatial control is a core capability in controllable image generation. A...

4 Jaemin Cho, et al. ∙

research

∙ 04/11/2023

Improving Vision-and-Language Navigation by Generating Future-View Image Semantics

Vision-and-Language Navigation (VLN) is the task that requires an agent ...

3 Jialu Li, et al. ∙

research

∙ 03/29/2023

Hierarchical Video-Moment Retrieval and Step-Captioning

There is growing interest in searching for information from large video ...

6 Abhay Zala, et al. ∙

research

∙ 03/28/2023

Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

As general purpose vision models get increasingly effective at a wide se...

0 Adyasha Maharana, et al. ∙

research

∙ 03/06/2023

Faithfulness-Aware Decoding Strategies for Abstractive Summarization

Despite significant progress in understanding and improving faithfulness...

3 David Wan, et al. ∙

research

∙ 01/10/2023

Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models

Language models are known to learn a great quantity of factual informati...

2 Peter Hase, et al. ∙

research

∙ 12/16/2022

MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation

Prompting large language models has enabled significant recent progress ...

6 Swarnadeep Saha, et al. ∙

research

∙ 12/15/2022

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Vision transformers (ViTs) have achieved impressive results on various c...

1 Yan-Bo Lin, et al. ∙

research

∙ 12/09/2022

VindLU: A Recipe for Effective Video-and-Language Pretraining

The last several years have witnessed remarkable progress in video-and-l...

6 Feng Cheng, et al. ∙

research

∙ 11/28/2022

Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality

Recent datasets expose the lack of the systematic generalization ability...

11 Yichen Jiang, et al. ∙

research

∙ 11/21/2022

Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention

We present Perceiver-VL, a vision-and-language framework that efficientl...

6 Zineng Tang, et al. ∙

research

∙ 11/15/2022

Evaluating the Factual Consistency of Large Language Models Through Summarization

While large language models (LLMs) have proven to be effective on a larg...

6 Derek Tam, et al. ∙

research

∙ 11/14/2022

Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations

Recent work on explainable NLP has shown that few-shot prompting can ena...

5 Swarnadeep Saha, et al. ∙

research

∙ 11/04/2022

Evaluating and Improving Factuality in Multimodal Abstractive Summarization

Current metrics for evaluating factuality for abstractive document summa...

0 David Wan, et al. ∙

research

∙ 10/18/2022

Exclusive Supermask Subnetwork Training for Continual Learning

Continual Learning (CL) methods mainly focus on avoiding catastrophic fo...

1 Prateek Yadav, et al. ∙

research

∙ 09/28/2022

TVLT: Textless Vision-Language Transformer

In this work, we present the Textless Vision-Language Transformer (TVLT)...

4 Zineng Tang, et al. ∙

research

∙ 09/21/2022

Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees

Current abstractive summarization models either suffer from a lack of cl...

10 Swarnadeep Saha, et al. ∙

research

∙ 09/13/2022

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation

Recent advances in text-to-image synthesis have led to large pretrained ...

4 Adyasha Maharana, et al. ∙

research

∙ 09/08/2022

Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization

The problems of unfaithful summaries have been widely discussed under th...

7 Shiyue Zhang, et al. ∙

research

∙ 07/25/2022

WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models

While vision-and-language models perform well on tasks such as visual qu...

1 Yonatan Bitton, et al. ∙

research

∙ 07/08/2022

CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination

As humans, we can modify our assumptions about a scene by imagining alte...

3 Hyounghun Kim, et al. ∙

research

∙ 07/08/2022

SETSum: Summarization and Visualization of Student Evaluations of Teaching

Student Evaluations of Teaching (SETs) are widely used in colleges and u...

3 Yinuo Hu, et al. ∙

research

∙ 07/05/2022

CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations

Vision-and-Language Navigation (VLN) tasks require an agent to navigate ...

4 Jialu Li, et al. ∙

research

∙ 06/30/2022

Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?

Previous Part-Of-Speech (POS) induction models usually assume certain in...

11 Xiang Zhou, et al. ∙

research

∙ 06/22/2022

VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives

Many past works aim to improve visual reasoning in models by supervising...

8 Zhuofan Ying, et al. ∙

research

∙ 06/13/2022

LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning

Fine-tuning large pre-trained models on downstream tasks has been adopte...

5 Yi-Lin Sung, et al. ∙

research

∙ 06/07/2022

Revealing Single Frame Bias for Video-and-Language Learning

Training an effective video-and-language model intuitively requires mult...

1 Jie Lei, et al. ∙

research

∙ 05/26/2022

Fine-grained Image Captioning with CLIP Reward

Modern image captioning models are usually trained with text similarity ...

8 Jaemin Cho, et al. ∙

research

∙ 05/22/2022

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

The goal of this work is to build flexible video-language models that ca...

11 Zhenhailong Wang, et al. ∙

research

∙ 05/18/2022

On the Limits of Evaluating Embodied Agent Model Generalization Using Validation Sets

Natural language guided embodied task completion is a challenging proble...

3 Hyounghun Kim, et al. ∙

research

∙ 05/16/2022

FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization

We present FactPEGASUS, an abstractive summarization model that addresse...

7 David Wan, et al. ∙

research

∙ 05/11/2022

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

Few-shot in-context learning (ICL) enables pre-trained language models t...

16 Haokun Liu, et al. ∙

research

∙ 05/04/2022

Efficient Few-Shot Fine-Tuning for Opinion Summarization

Abstractive summarization models are typically pre-trained on large amou...

3 Arthur Bražinskas, et al. ∙

research

∙ 04/29/2022

How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?

A multilingual tokenizer is a fundamental component of multilingual neur...

2 Shiyue Zhang, et al. ∙
