
Improving BERT with Self-Supervised Attention

by   Xiaoyu Kou, et al.
ETH Zurich
Peking University

One of the most popular paradigms for applying large, pre-trained NLP models such as BERT is to fine-tune them on a smaller dataset. However, one challenge remains: the fine-tuned model often overfits on smaller datasets. A symptom of this phenomenon is that irrelevant words in a sentence, even when they are obviously irrelevant to humans, can substantially degrade the performance of these fine-tuned BERT models. In this paper, we propose a novel technique, called Self-Supervised Attention (SSA), to address this generalization challenge. Specifically, SSA iteratively generates weak, token-level attention labels by "probing" the fine-tuned model from the previous iteration. We investigate two different ways of integrating SSA into BERT and propose a hybrid approach that combines their benefits. Empirically, on a variety of public datasets, we demonstrate significant performance improvements with our SSA-enhanced BERT model.
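The label-generation step described in the abstract can be illustrated with a minimal sketch. Here, a toy `score` function stands in for a fine-tuned BERT classifier (the real paper probes an actual fine-tuned model); the hypothetical `ssa_labels` helper and the `threshold` parameter are illustrative assumptions, not the paper's exact procedure. The idea: delete one token at a time and mark tokens whose removal noticeably weakens the model's prediction as relevant.

```python
def score(tokens):
    """Toy stand-in for a fine-tuned classifier: confidence that the
    sentence is positive, driven here by a couple of sentiment words."""
    positives = {"great", "excellent"}
    hits = sum(1 for t in tokens if t in positives)
    return min(1.0, 0.5 + 0.25 * hits)

def ssa_labels(tokens, threshold=0.1):
    """Weak token-level attention labels: 1 if deleting the token drops
    the model's confidence by more than `threshold`, else 0."""
    base = score(tokens)
    labels = []
    for i in range(len(tokens)):
        probed = tokens[:i] + tokens[i + 1:]  # sentence with token i removed
        drop = base - score(probed)
        labels.append(1 if drop > threshold else 0)
    return labels

sentence = ["the", "movie", "was", "great"]
print(ssa_labels(sentence))  # → [0, 0, 0, 1]: only "great" is marked relevant
```

In the full method, these weak labels would then supervise an auxiliary attention objective during the next fine-tuning iteration, and the probing is repeated with the improved model.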



