Interpreting A Pre-trained Model Is A Key For Model Architecture Optimization: A Case Study On Wav2Vec 2.0

04/07/2021
by Liu Chen, et al.

A deep Transformer model with a good evaluation score does not necessarily mean that each subnetwork (a.k.a. Transformer block) learns a reasonable representation. Diagnosing abnormal representations and avoiding them can contribute to a better evaluation score. We propose a new perspective for analyzing attention patterns: summarize block-level patterns and assume that abnormal patterns have a negative influence. We use Wav2Vec 2.0 as the research target and analyze a pre-trained model's patterns. All experiments use Librispeech-100-clean as training data. By avoiding the diagnosed abnormal patterns, our custom Wav2Vec 2.0 outperforms the original version by about 4.8 absolute word error rate (WER) on test-clean with Viterbi decoding. Our version is still 0.9 [...] We identify that avoiding abnormal patterns is the main contributor to the performance boost.
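
The abstract reports its gains as absolute WER. As a reminder of what that metric measures, here is a minimal sketch of word error rate as word-level edit distance divided by reference length. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper, and this is not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Hypothetical helper for illustration; whitespace tokenization is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

An absolute improvement is the plain difference between two such rates expressed in percentage points, as opposed to a relative improvement, which divides that difference by the baseline rate.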
