WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit

02/02/2021
by   BinBin Zhang, et al.
0

In this paper, we present a new open source, production first and production ready end-to-end (E2E) speech recognition toolkit named WeNet. The main motivation of WeNet is to close the gap between the research and the production of E2E speech recognition models. WeNet provides an efficient way to ship ASR applications in several real-world scenarios, which is the main difference and advantage to other open source E2E speech recognition toolkits. This paper introduces WeNet from three aspects, including model architecture, framework design and performance metrics. Our experiments on AISHELL-1 using WeNet, not only give a promising character error rate (CER) on a unified streaming and non-streaming two pass (U2) E2E model but also show reasonable RTF and latency, both of these aspects are favored for production adoption. The toolkit is publicly available at https://github.com/mobvoi/wenet.

READ FULL TEXT
research
03/30/2018

ESPnet: End-to-End Speech Processing Toolkit

This paper introduces a new open source platform for end-to-end speech p...
research
03/29/2022

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Recently, we made available WeNet, a production-oriented end-to-end spee...
research
10/30/2022

WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit

Keyword spotting (KWS) enables speech-based user interaction and gradual...
research
05/11/2017

Reducing Bias in Production Speech Models

Replacing hand-engineered pipelines with end-to-end deep learning system...
research
04/20/2020

ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric

Estimation of perceptual quality in audio and speech is possible using a...
research
05/18/2023

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

This paper introduces FunASR, an open-source speech recognition toolkit ...
research
05/27/2020

CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency

In this paper, we present a new open source toolkit for speech recogniti...

Please sign up or login with your details

Forgot password? Click here to reset