Lucas Smaira

research

∙ 05/23/2023

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

We propose a novel multimodal video benchmark - the Perception Test - to...

Viorica Patraucean, et al.

research

∙ 01/23/2023

Zorro: the masked multimodal transformer

Attention-based models are appealing for multimodal processing because i...

Adria Recasens, et al.

research

∙ 11/07/2022

TAP-Vid: A Benchmark for Tracking Any Point in a Video

Generic motion understanding from video involves not only tracking objec...

Carl Doersch, et al.

research

∙ 11/23/2021

Towards Learning Universal Audio Representations

The ability to learn universal audio representations that can solve dive...

Luyu Wang, et al.

research

∙ 11/28/2020

Human-Agent Cooperation in Bridge Bidding

We introduce a human-compatible reinforcement-learning approach to a coo...

Edward Lockhart, et al.

research

∙ 10/21/2020

A Short Note on the Kinetics-700-2020 Human Action Dataset

We describe the 2020 edition of the DeepMind Kinetics human action datas...

Lucas Smaira, et al.

research

∙ 06/29/2020

Self-Supervised MultiModal Versatile Networks

Videos are a rich source of multi-modal supervision. In this work, we le...

Jean-Baptiste Alayrac, et al.

research

∙ 03/11/2020

Visual Grounding in Video for Unsupervised Word Translation

There are thousands of actively spoken languages on Earth, but a single ...

Gunnar A. Sigurdsson, et al.

research

∙ 12/13/2019

End-to-End Learning of Visual Representations from Uncurated Instructional Videos

Annotating videos is cumbersome, expensive and not scalable. Yet, many s...

Antoine Miech, et al.
