Yosuke Kashiwagi

research

∙ 09/16/2023

Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation

Collecting audio-text pairs is expensive; however, it is much easier to ...

Emiru Tsunoo, et al.

research

∙ 07/24/2023

Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition

Although frame-based models, such as CTC and transducers, have an affini...

Emiru Tsunoo, et al.

research

∙ 07/20/2023

Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding

There has been an increased interest in the integration of pretrained sp...

Siddhant Arora, et al.

research

∙ 05/02/2023

A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge

Recently there have been efforts to introduce new benchmark tasks for sp...

Siddhant Arora, et al.

research

∙ 05/02/2023

The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge

This paper describes our system for the low-resource domain adaptation t...

Hayato Futami, et al.

research

∙ 11/16/2022

Streaming Joint Speech Recognition and Disfluency Detection

Disfluency detection has mainly been solved in a pipeline approach, as p...

Hayato Futami, et al.

research

∙ 06/15/2022

Residual Language Model for End-to-end Speech Recognition

End-to-end automatic speech recognition suffers from adaptation to unkno...

Emiru Tsunoo, et al.

research

∙ 02/03/2022

Joint Speech Recognition and Audio Captioning

Speech samples recorded in both indoor and outdoor environments are ofte...

Chaitanya Narisetty, et al.

research

∙ 01/25/2022

Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR

A streaming style inference of encoder-decoder automatic speech recognit...

Emiru Tsunoo, et al.

research

∙ 10/12/2021

Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models

A deep neural network (DNN)-based speech enhancement (SE) aiming to maxi...

Ryosuke Sawata, et al.

research

∙ 06/07/2021

Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios

Although end-to-end automatic speech recognition (E2E ASR) has achieved ...

Emiru Tsunoo, et al.

research

∙ 02/18/2021

Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition

Self-attention (SA) based models have recently achieved significant perf...

Yosuke Kashiwagi, et al.

research

∙ 06/25/2020

Streaming Transformer ASR with Blockwise Synchronous Inference

The Transformer self-attention network has recently shown promising perf...

Emiru Tsunoo, et al.

research

∙ 10/25/2019

Towards Online End-to-end Transformer Automatic Speech Recognition

The Transformer self-attention network has recently shown promising perf...

Emiru Tsunoo, et al.

research

∙ 10/16/2019

Transformer ASR with Contextual Block Processing

The Transformer self-attention network has recently shown promising perf...

Emiru Tsunoo, et al.

research

∙ 05/17/2019

End-to-end Adaptation with Backpropagation through WFST for On-device Speech Recognition System

An on-device DNN-HMM speech recognition system efficiently works with a ...

Emiru Tsunoo, et al.

