Joon Son Chung

research

∙ 09/21/2023

TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning

The goal of this work is Active Speaker Detection (ASD), a task to deter...

Chaeyoung Jung, et al.

research

∙ 09/21/2023

SlowFast Network for Continuous Sign Language Recognition

The objective of this work is the effective extraction of spatial and dy...

Junseok Ahn, et al.

research

∙ 09/19/2023

Sound Source Localization is All about Cross-Modal Alignment

Humans can easily perceive the direction of sound sources in a visual sc...

Arda Senocak, et al.

research

∙ 08/29/2023

Let There Be Sound: Reconstructing High Quality Speech from Silent Videos

The goal of this work is to reconstruct high quality speech from lip mot...

Ji-Hoon Kim, et al.

research

∙ 07/18/2023

FlexiAST: Flexibility is What AST Needs

The objective of this work is to give patch-size flexibility to Audio Sp...

Jiu Feng, et al.

research

∙ 04/06/2023

That's What I Said: Fully-Controllable Talking Face Generation

The goal of this paper is to synthesise talking faces with controllable ...

Youngjoon Jang, et al.

research

∙ 03/30/2023

Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples

The objective of this work is to explore the learning of visually ground...

Hyeonggon Ryu, et al.

research

∙ 03/21/2023

Self-Sufficient Framework for Continuous Sign Language Recognition

The goal of this work is to develop a self-sufficient framework for Contin...

Youngjoon Jang, et al.

research

∙ 02/27/2023

Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech

The goal of this work is zero-shot text-to-speech synthesis, with speaki...

Jiyoung Lee, et al.

research

∙ 11/03/2022

MarginNCE: Robust Sound Localization with a Negative Margin

The goal of this work is to localize sound sources in visual scenes with...

Sooyoung Park, et al.

research

∙ 11/01/2022

Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition

The goal of this work is background-robust continuous sign language reco...

Youngjoon Jang, et al.

research

∙ 11/01/2022

Metric Learning for User-defined Keyword Spotting

The goal of this work is to detect new spoken terms defined by users. Wh...

Jaemin Jung, et al.

research

∙ 11/01/2022

Disentangled representation learning for multilingual speaker recognition

The goal of this paper is to train speaker embeddings that are robust to...

Kihyun Nam, et al.

research

∙ 10/20/2022

Large-scale learning of generalised representations for speaker recognition

The objective of this work is to develop a speaker recognition model to ...

Jee-weon Jung, et al.

research

∙ 04/21/2022

Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion

Deep learning has brought impressive progress in the study of both autom...

Hye-Jin Shim, et al.

research

∙ 03/16/2022

Raw waveform speaker verification for supervised and self-supervised learning

Speaker verification models that directly operate upon raw waveforms are...

Jee-weon Jung, et al.

research

∙ 10/07/2021

Disentangled dimensionality reduction for noise-robust speaker diarisation

The objective of this work is to train noise-robust speaker embeddings f...

You Jin Kim, et al.

research

∙ 10/07/2021

Multi-scale speaker embedding-based graph attention networks for speaker diarisation

The objective of this work is effective speaker diarisation using multi-...

Youngki Kwon, et al.

research

∙ 10/06/2021

Spell my name: keyword boosted speech recognition

Recognition of uncommon words such as names and technical terminology is...

Namkyu Jung, et al.

research

∙ 10/04/2021

AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

Artefacts that differentiate spoofed from bona-fide utterances can resid...

Jee-weon Jung, et al.

research

∙ 08/17/2021

Look Who's Talking: Active Speaker Detection in the Wild

In this work, we present a novel audio-visual dataset for active speaker...

You Jin Kim, et al.

research

∙ 04/07/2021

Adapting Speaker Embeddings for Speaker Diarisation

The goal of this paper is to adapt speaker embeddings for solving the pr...

Youngki Kwon, et al.

research

∙ 04/07/2021

Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network

In this work, we propose an overlapped speech detection system trained a...

Jee-weon Jung, et al.

research

∙ 11/30/2020

Look who's not talking

The objective of this work is speaker diarisation of speech recordings '...

Youngki Kwon, et al.

research

∙ 11/10/2020

Supervised attention for speaker recognition

The recently proposed self-attentive pooling (SAP) has shown good perfor...

Seong Min Kye, et al.

research

∙ 10/29/2020

The ins and outs of speaker recognition: lessons from VoxSRC 2020

The VoxCeleb Speaker Recognition Challenge (VoxSRC) at Interspeech 2020 ...

Yoohwan Kwon, et al.

research

∙ 10/22/2020

Graph Attention Networks for Speaker Verification

This work presents a novel back-end framework for speaker verification u...

Jee-weon Jung, et al.

research

∙ 09/29/2020

Clova Baseline System for the VoxCeleb Speaker Recognition Challenge 2020

This report describes our submission to the VoxCeleb Speaker Recognition...

Hee-Soo Heo, et al.

research

∙ 08/13/2020

Cross attentive pooling for speaker verification

The goal of this paper is text-independent speaker verification where ut...

Seong Min Kye, et al.

research

∙ 08/10/2020

Self-Supervised Learning of Audio-Visual Objects from Video

Our objective is to transform a video into a set of discrete audio-visua...

Triantafyllos Afouras, et al.

research

∙ 07/23/2020

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues

Recent progress in fine-grained gesture and action classification, and m...

Samuel Albanie, et al.

research

∙ 07/23/2020

Augmentation adversarial training for unsupervised speaker recognition

The goal of this work is to train robust speaker recognition models with...

Jaesung Huh, et al.

research

∙ 07/02/2020

Spot the conversation: speaker diarisation in the wild

The goal of this paper is speaker diarisation of videos collected 'in th...

Joon Son Chung, et al.

research

∙ 05/18/2020

Metric Learning for Keyword Spotting

The goal of this work is to train effective representations for keyword ...

Jaesung Huh, et al.

research

∙ 05/14/2020

FaceFilter: Audio-visual speech separation using still images

The objective of this paper is to separate a target speaker's speech fro...

Soo-Whan Chung, et al.

research

∙ 04/29/2020

Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision

The goal of this work is to train discriminative cross-modal embeddings ...

Soo-Whan Chung, et al.

research

∙ 03/26/2020

In defence of metric learning for speaker recognition

The objective of this paper is 'open-set' speaker recognition of unseen ...

Joon Son Chung, et al.

research

∙ 02/20/2020

Disentangled Speech Embeddings using Cross-modal Self-supervision

The objective of this paper is to learn representations of speaker ident...

Arsha Nagrani, et al.

research

∙ 12/05/2019

VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge

The VoxCeleb Speaker Recognition Challenge 2019 aimed to assess how well...

Joon Son Chung, et al.

research

∙ 11/28/2019

ASR is all you need: cross-modal distillation for lip reading

The goal of this work is to train strong models for visual speech recogn...

Triantafyllos Afouras, et al.

research

∙ 11/06/2019

The sound of my voice: speaker representation loss for target voice separation

Research on content and style representations has been widely studied in...

Seongkyu Mun, et al.

research

∙ 10/24/2019

Delving into VoxCeleb: environment invariant speaker recognition

Research in speaker recognition has recently seen significant progress d...

Joon Son Chung, et al.

research

∙ 07/11/2019

My lips are concealed: Audio-visual speech enhancement through obstructions

Our objective is an audio-visual model for separating a single speaker f...

Triantafyllos Afouras, et al.

research

∙ 06/25/2019

Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)

This report describes our submission to the ActivityNet Challenge at CVP...

Joon Son Chung, et al.

research

∙ 06/24/2019

Who said that?: Audio-visual speaker diarisation of real-world meetings

The goal of this work is to determine 'who spoke when' in real-world mee...

Joon Son Chung, et al.

research

∙ 02/26/2019

Utterance-level Aggregation For Speaker Recognition In The Wild

The objective of this paper is speaker recognition "in the wild"-where u...

Weidi Xie, et al.

research

∙ 09/21/2018

Perfect match: Improved cross-modal embeddings for audio-visual synchronisation

This paper proposes a new strategy for learning powerful cross-modal emb...

Soo-Whan Chung, et al.

research

∙ 09/06/2018

Deep Audio-Visual Speech Recognition

The goal of this work is to recognise phrases and sentences being spoken...

Triantafyllos Afouras, et al.

research

∙ 09/03/2018

LRS3-TED: a large-scale dataset for visual speech recognition

This paper introduces a new multi-modal dataset for visual and audio-vis...

Triantafyllos Afouras, et al.

research

∙ 06/15/2018

Deep Lip Reading: a comparison of models and an online application

The goal of this paper is to develop state-of-the-art models for lip rea...

Triantafyllos Afouras, et al.
