Alexander Schwing

research

∙ 09/07/2023

Tracking Anything with Decoupled Video Segmentation

Training data for video segmentation are expensive to annotate. This imp...

0 Ho Kei Cheng, et al. ∙

research

∙ 12/08/2022

SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation

In this work, we present a novel framework built to simplify 3D asset ge...

0 Yen-Chi Cheng, et al. ∙

research

∙ 10/12/2022

On the Importance of Gradient Norm in PAC-Bayesian Bounds

Generalization bounds which assess the difference between the true risk ...

0 Itai Gat, et al. ∙

research

∙ 04/14/2022

Joint Forecasting of Panoptic Segmentations with Difference Attention

Forecasting of a representation is important for safe and effective auto...

10 Colin Graber, et al. ∙

research

∙ 12/20/2021

MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding

Recently, there has been an increasing interest in building question ans...

14 Revanth Gangi Reddy, et al. ∙

research

∙ 10/27/2021

Perceptual Score: What Data Modalities Does Your Model Perceive?

Machine learning advances in the last decade have relied significantly o...

0 Itai Gat, et al. ∙

research

∙ 10/24/2021

CoVA: Context-aware Visual Attention for Webpage Information Extraction

Webpage information extraction (WIE) is an important step to create know...

0 Anurendra Kumar, et al. ∙

research

∙ 10/12/2021

Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents

Communication between embodied AI agents has received increasing attenti...

8 Shivansh Patel, et al. ∙

research

∙ 08/26/2021

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

It is fundamental for personal robots to reliably navigate to a specifie...

7 Xiaoming Zhao, et al. ∙

research

∙ 08/04/2021

Towards Coherent Visual Storytelling with Ordered Image Attention

We address the problem of visual storytelling, i.e., generating a story ...

0 Tom Braude, et al. ∙

research

∙ 04/14/2021

GridToPix: Training Embodied Agents with Minimal Supervision

While deep reinforcement learning (RL) promises freedom from hand-labele...

0 Unnat Jain, et al. ∙

research

∙ 04/08/2021

Panoptic Segmentation Forecasting

Our goal is to forecast the near future given a set of recent observatio...

0 Colin Graber, et al. ∙

research

∙ 10/21/2020

Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies

Many recent datasets contain a variety of different data modalities, for...

0 Itai Gat, et al. ∙

research

∙ 10/06/2020

NCP-VAE: Variational Autoencoders with Noise Contrastive Priors

Variational autoencoders (VAEs) are one of the powerful likelihood-based...

29 Jyoti Aneja, et al. ∙

research

∙ 07/23/2020

Bridging the Imitation Gap by Adaptive Insubordination

Why do agents often obtain better reinforcement learning policies when i...

5 Luca Weihs, et al. ∙

research

∙ 07/09/2020

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

Autonomous agents must learn to collaborate. It is not scalable to devel...

2 Unnat Jain, et al. ∙

research

∙ 04/21/2020

The 1st Agriculture-Vision Challenge: Methods and Results

The first Agriculture-Vision Challenge aims to encourage research in dev...

18 Mang Tik Chiu, et al. ∙

research

∙ 02/21/2020

Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning

In many vision-based reinforcement learning (RL) problems, the agent con...

8 Yuanyi Zhong, et al. ∙

research

∙ 01/05/2020

Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis

The success of deep learning in visual recognition tasks has driven adva...

11 Mang Tik Chiu, et al. ∙

research

∙ 12/14/2019

Calorimetry with Deep Learning: Particle Simulation and Reconstruction for Collider Physics

Using detailed simulations of calorimeter showers as training data, we i...

0 Dawit Belayneh, et al. ∙

research

∙ 10/31/2019

Graph Structured Prediction Energy Networks

For joint inference over multiple variables, a variety of structured pre...

0 Colin Graber, et al. ∙

research

∙ 08/22/2019

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning

Diverse and accurate vision+language modeling is an important goal to re...

12 Jyoti Aneja, et al. ∙

research

∙ 08/22/2019

ViCo: Word Embeddings from Visual Co-occurrences

We propose to learn word embeddings from visual co-occurrences. Two word...

0 Tanmay Gupta, et al. ∙

research

∙ 07/22/2019

Maya: Falsifying Power Sidechannels with Dynamic Control

The security of computers is at risk because of information leaking thro...

0 Raghavendra Pradyumna Pothukuchi, et al. ∙

research

∙ 04/11/2019

Factor Graph Attention

Dialog is an effective way to exchange information, but subtle details a...

0 Idan Schwartz, et al. ∙

research

∙ 04/11/2019

Two Body Problem: Collaborative Visual Task Completion

Collaboration is a necessary skill to perform tasks that are beyond one ...

12 Unnat Jain, et al. ∙

research

∙ 04/11/2019

Max-Sliced Wasserstein Distance and its use for GANs

Generative adversarial nets (GANs) and variational auto-encoders have si...

12 Ishan Deshpande, et al. ∙

research

∙ 04/11/2019

A Simple Baseline for Audio-Visual Scene-Aware Dialog

The recently proposed audio-visual scene-aware dialog task paves the way...

0 Idan Schwartz, et al. ∙

research

∙ 11/14/2018

No-Frills Human-Object Interaction Detection: Factorization, Appearance and Layout Encodings, and Training Techniques

We show that with an appropriate factorization, and encodings of layout ...

6 Tanmay Gupta, et al. ∙

research

∙ 11/08/2018

Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

Distributed training of deep nets is an important technique to address s...

0 Youjie Li, et al. ∙

research

∙ 11/08/2018

GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training

Data parallelism can boost the training speed of convolutional neural ne...

0 Mingchao Yu, et al. ∙

research

∙ 11/01/2018

Deep Structured Prediction with Nonlinear Output Transformations

Deep structured models are widely used for tasks like semantic segmentat...

0 Colin Graber, et al. ∙

research

∙ 05/31/2018

Diverse and Controllable Image Captioning with Part-of-Speech Guidance

Automatically describing an image is an important capability for virtual...

0 Aditya Deshpande, et al. ∙

research

∙ 03/29/2018

Generative Modeling using the Sliced Wasserstein Distance

Generative Adversarial Nets (GANs) are very successful at modeling distr...

2 Ishan Deshpande, et al. ∙

research

∙ 03/29/2018

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

Human conversation is a complex mechanism with subtle nuances. It is hen...

0 Unnat Jain, et al. ∙

research

∙ 11/24/2017

Convolutional Image Captioning

Image captioning is an important but challenging task, applicable to vir...

0 Jyoti Aneja, et al. ∙

research

∙ 06/19/2017

Dualing GANs

Generative adversarial nets (GANs) are a promising technique for modelin...

0 Yujia Li, et al. ∙

research

∙ 04/11/2017

Creativity: Generating Diverse Questions using Variational Autoencoders

Generating diverse questions for given images is an important task for c...

0 Unnat Jain, et al. ∙

research

∙ 09/09/2015

Statistical Inference, Learning and Models in Big Data

The need for new methods to deal with big data is a common theme in most...

0 Beate Franke, et al. ∙

research

∙ 06/27/2012

Efficient Structured Prediction with Latent Variables for General Graphical Models

In this paper we propose a unified framework for structured prediction w...

0 Alexander Schwing, et al. ∙
