In conventional studies on environmental sound separation and synthesis ...
Large-scale pretrained models using self-supervised learning have report...
Due to the high performance of multi-channel speech processing, we can u...
A method to perform offline and online speaker diarization for an unlimi...
We propose a fundamental theory on ensemble learning that evaluates a gi...
This paper investigates a method for simulating natural conversation in ...
Onomatopoeia, which is a character sequence that phonetically imitates a...
Recent progress on end-to-end neural diarization (EEND) has enabled over...
Attractor-based end-to-end diarization is achieving comparable accuracy ...
This paper investigates an end-to-end neural diarization (EEND) method f...
In this paper, we present a semi-supervised training technique using pse...
In this paper, we present a conditional multitask learning method for en...
This paper provides a detailed description of the Hitachi-JHU system tha...
This paper proposes an online end-to-end diarization that can handle ove...
This paper investigates the utilization of an end-to-end diarization mod...
We propose a block-online algorithm for guided source separation (GSS). G...
A novel framework for meeting transcription using asynchronous microphon...
End-to-end speaker diarization using a fully supervised self-attention m...
Speaker diarization is an essential step for processing multi-speaker au...
End-to-end speaker diarization for an unknown number of speakers is addr...
The most common approach to speaker diarization is clustering of speaker...
This paper investigates the use of target-speaker automatic speech recog...
Speaker diarization has been mainly developed based on the clustering of...
In this paper, we propose a novel end-to-end neural-network-based speake...
In this paper, we propose a novel auxiliary loss function for target-spe...
In this paper, we present Hitachi and Paderborn University's joint effor...
Currently, food image recognition tasks are evaluated against fixed data...
The extraction of useful deep features is important for many computer vi...