Learned Transferable Architectures Can Surpass Hand-Designed Architectures for Large Scale Speech Recognition

by Liqiang He, et al.

In this paper, we explore neural architecture search (NAS) for automatic speech recognition (ASR) systems. Following previous work in computer vision, we focus on the transferability of the searched architecture: the architecture search is conducted on a small proxy dataset, and the evaluation network, constructed from the searched architecture, is then evaluated on a large dataset. In particular, we propose a revised search space for speech recognition tasks that, in theory, helps the search algorithm explore architectures with low complexity. Extensive experiments show that: (i) the architecture searched on the small proxy dataset can be transferred to the large dataset for speech recognition tasks; (ii) the architecture learned in the revised search space can greatly reduce computational overhead and GPU memory usage with mild performance degradation; (iii) the searched architecture can achieve more than 20 on the AISHELL-2 dataset and the large (10k hours) dataset, compared with our best hand-designed DFSMN-SAN architecture. To the best of our knowledge, this is the first report of NAS results on a large-scale dataset (up to 10k hours), indicating the promising application of NAS to industrial ASR systems.





Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search

Recently neural architecture search (NAS) has been successfully used in i...

Latency-Controlled Neural Architecture Search for Streaming Speech Recognition

Recently, neural architecture search (NAS) has attracted much attention ...

NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search

Neural Architecture Search (NAS) is an open and challenging problem in m...

Neural Architecture Search for Speech Recognition

Deep neural networks (DNNs) based automatic speech recognition (ASR) sys...

RSBNet: One-Shot Neural Architecture Search for A Backbone Network in Remote Sensing Image Recognition

Recently, a massive number of deep learning based approaches have been s...

Learning Architectures from an Extended Search Space for Language Modeling

Neural architecture search (NAS) has advanced significantly in recent ye...

DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

In previous works, only parameter weights of ASR models are optimized un...