We introduce a multilingual speaker change detection model (USM-SCD) tha...
Although numerous clustering algorithms have been developed, many existi...
We develop scalable randomized kernel methods for jointly associating da...
Given the large-scale data and the high annotation cost, pretraining-fin...
Technological advances have enabled the generation of unique and complem...
In this work we propose a novel token-based training strategy that impro...
While recent research advances in speaker diarization mostly focus on im...
This paper introduces the contrastive siamese (c-siam) network, an architect...
Predictive learning ideally builds the world model of physical processes...
Combinatorial optimization (CO) is a long-standing challenging task not ...
In the context of Bayesian inversion for scientific and engineering mode...
In this paper, we present a novel speaker diarization system for streami...
Computer systems such as storage systems normally require transparent wh...
Reducing prediction delay for streaming end-to-end ASR models with minim...
In this paper we present a Transformer-Transducer model architecture and...
Human-Object Interaction (HOI) detection lies at the core of action unde...
In this paper we present an end-to-end speech recognition model with Tra...
Homographs, words with different meanings but the same surface form, hav...
Previous work has modeled the compositionality of words by creating char...