Conformer LLMs – Convolution Augmented Large Language Models

07/02/2023
by Prateek Verma, et al.

This work brings together two popular building blocks of neural architectures, convolutional layers and Transformers, for large language models (LLMs). Non-causal Conformers are used ubiquitously in automatic speech recognition. This work aims to adapt these architectures to a causal setup for training LLMs. Transformer decoders effectively capture long-range dependencies across several modalities and form a core backbone of modern advances in machine learning. Convolutional architectures have been popular for extracting features in domains such as raw 1-D signals, speech, and images, to name a few. In this paper, by combining local and global dependencies over latent representations using causal convolutional filters and Transformers, we achieve significant gains in performance. This work showcases a robust speech architecture that can be integrated and adapted in a causal setup beyond speech applications for large-scale language modeling.
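
To make the idea of combining local and global dependencies in a causal setup concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single attention head with no learned projections, a single shared depthwise kernel, residual connections, and an attention-then-convolution ordering; the function names (`causal_conv1d`, `causal_self_attention`, `causal_conformer_block`) are hypothetical. The point it illustrates is that left-padding the convolution and masking attention to past positions keeps every output at step t independent of future tokens.

```python
import math

def causal_conv1d(x, kernel):
    """Depthwise causal convolution over a list of T feature vectors.

    kernel[-1] weights the current step and kernel[0] the oldest step.
    Zero-padding on the left means output[t] depends only on x[:t+1].
    """
    T, D, K = len(x), len(x[0]), len(kernel)
    padded = [[0.0] * D for _ in range(K - 1)] + x
    return [
        [sum(kernel[k] * padded[t + k][d] for k in range(K)) for d in range(D)]
        for t in range(T)
    ]

def causal_self_attention(x):
    """Single-head scaled dot-product attention with a causal mask.

    Assumption: queries, keys, and values are the inputs themselves
    (no learned projections), purely to keep the sketch short.
    """
    T, D = len(x), len(x[0])
    out = []
    for t in range(T):
        # Only positions s <= t are scored, which is the causal mask.
        scores = [
            sum(x[t][d] * x[s][d] for d in range(D)) / math.sqrt(D)
            for s in range(t + 1)
        ]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(
            [sum(w / z * x[s][d] for s, w in enumerate(weights)) for d in range(D)]
        )
    return out

def add(a, b):
    """Element-wise sum of two sequences of vectors (residual connection)."""
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(a, b)]

def causal_conformer_block(x, kernel):
    """Assumed ordering: global attention first, then local convolution."""
    h = add(x, causal_self_attention(x))      # global dependencies
    return add(h, causal_conv1d(h, kernel))   # local dependencies

# Changing the last token must not change outputs at earlier positions.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
kernel = [0.25, 0.25, 0.5]
y = causal_conformer_block(x, kernel)
y_changed = causal_conformer_block(x[:3] + [[9.0, 9.0]], kernel)
assert y[:3] == y_changed[:3]
```

A real model would add learned projections, multiple heads, per-channel kernels, normalization, and feed-forward layers; those are omitted here because the abstract does not specify them.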

