
-
WiCV 2020: The Seventh Women In Computer Vision Workshop
In this paper we present the details of Women in Computer Vision Worksho...
-
Look Before you Speak: Visually Contextualized Utterances
While most conversational AI systems focus on textual dialogue only, con...
-
Cough Against COVID: Evidence of COVID-19 Signature in Cough Sounds
Testing capacity for COVID-19 remains a challenge globally due to the la...
-
Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos
Despite the recent advances in video classification, progress in spatio-...
-
Spot the conversation: speaker diarisation in the wild
The goal of this paper is speaker diarisation of videos collected 'in th...
-
Condensed Movies: Story Based Retrieval with Contextual Embeddings
Our objective in this work is the long range understanding of the narrat...
-
Speech2Action: Cross-modal Supervision for Action Recognition
Is it possible to guess human action from dialogue alone? In this work w...
-
Disentangled Speech Embeddings using Cross-modal Self-supervision
The objective of this paper is to learn representations of speaker ident...
-
VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge
The VoxCeleb Speaker Recognition Challenge 2019 aimed to assess how well...
-
WiCV 2019: The Sixth Women In Computer Vision Workshop
In this paper we present the Women in Computer Vision Workshop - WiCV 20...
-
Count, Crop and Recognise: Fine-Grained Recognition in the Wild
The goal of this paper is to label all the animal individuals present in...
-
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition
We focus on multi-modal fusion for egocentric action recognition, and pr...
-
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
The rapid growth of video on the internet has made searching for video c...
-
Utterance-level Aggregation For Speaker Recognition In The Wild
The objective of this paper is speaker recognition "in the wild"-where u...
-
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild
Obtaining large, human labelled speech datasets to train models for emot...
-
VoxCeleb2: Deep Speaker Recognition
The objective of this paper is speaker recognition under noisy and uncon...
-
Learnable PINs: Cross-Modal Embeddings for Person Identity
We propose and investigate an identity sensitive joint embedding of face...
-
Seeing Voices and Hearing Faces: Cross-modal biometric matching
We introduce a seemingly impossible task: given only an audio clip of so...
-
From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script
The goal of this paper is the automatic identification of characters in ...
-
VoxCeleb: a large-scale speaker identification dataset
Most existing datasets for speaker identification contain samples obtain...