HUBERT Untangles BERT to Improve Transfer across NLP Tasks

10/25/2019
by Mehrad Moradshahi, et al.

We introduce HUBERT, which combines the structured-representational power of Tensor-Product Representations (TPRs) with BERT, a pre-trained bidirectional Transformer language model. We show that there is shared structure across different NLP datasets that HUBERT, but not BERT, is able to learn and leverage. We validate the effectiveness of our model on the GLUE benchmark and the HANS dataset. Our experimental results show that untangling data-specific semantics from general language structure is key to better transfer among NLP tasks.
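To make the TPR idea concrete, the sketch below shows one plausible way a TPR-style head could sit on top of contextual token embeddings such as BERT's final hidden states: each token is softly mapped onto learned filler ("content") and role ("structure") dictionaries, and the two are bound with an outer product. The dictionary sizes, class and parameter names, and the softmax-based filler/role selection are illustrative assumptions, not the authors' exact HUBERT architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F


class TPRHead(nn.Module):
    """Illustrative Tensor-Product Representation head (hypothetical sketch).

    Each token embedding is mapped to a soft mixture over a small dictionary of
    learned filler ("what") vectors and a separate dictionary of learned role
    ("where/how") vectors; the two are bound with an outer product and flattened.
    """

    def __init__(self, hidden_size, n_fillers=32, n_roles=16, d_filler=32, d_role=16):
        super().__init__()
        self.filler_keys = nn.Linear(hidden_size, n_fillers)
        self.role_keys = nn.Linear(hidden_size, n_roles)
        self.fillers = nn.Parameter(torch.randn(n_fillers, d_filler))
        self.roles = nn.Parameter(torch.randn(n_roles, d_role))

    def forward(self, token_states):
        # token_states: [batch, seq_len, hidden_size], e.g. BERT's last-layer states.
        f_attn = F.softmax(self.filler_keys(token_states), dim=-1)  # [B, T, n_fillers]
        r_attn = F.softmax(self.role_keys(token_states), dim=-1)    # [B, T, n_roles]
        filler = f_attn @ self.fillers   # [B, T, d_filler]
        role = r_attn @ self.roles       # [B, T, d_role]
        # Binding: outer product of filler and role per token, then flattened.
        bound = torch.einsum('btf,btr->btfr', filler, role)
        return bound.flatten(-2)         # [B, T, d_filler * d_role]


if __name__ == "__main__":
    # Stand-in for contextual embeddings (e.g. from a BERT encoder).
    states = torch.randn(2, 10, 768)
    tpr = TPRHead(hidden_size=768)
    print(tpr(states).shape)  # torch.Size([2, 10, 512])

Separating the role attention from the filler attention is what lets structure ("how a token functions") be learned independently of content ("what the token means"), which is the disentanglement the abstract credits for improved transfer.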

