TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

by Zhiheng Huang, et al.

Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks, including sentence classification, machine translation, and question answering. The BERT model architecture is derived primarily from the transformer. Before the transformer era, bidirectional Long Short-Term Memory (BLSTM) was the dominant modeling architecture for neural machine translation and question answering. In this paper, we investigate how these two modeling techniques can be combined to create a more powerful model architecture. We propose a new architecture, denoted Transformer with BLSTM (TRANS-BLSTM), which integrates a BLSTM layer into each transformer block, yielding a joint modeling framework for transformer and BLSTM. We show that TRANS-BLSTM models consistently improve accuracy over BERT baselines in GLUE and SQuAD 1.1 experiments. Our TRANS-BLSTM model obtains an F1 score of 94.01% on the SQuAD 1.1 development dataset, which is comparable to the state-of-the-art result.
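
The abstract does not say how the BLSTM layer is wired into each transformer block, so the following is only a minimal PyTorch sketch of one plausible reading, not the authors' implementation. It assumes the BLSTM reads the same input as a standard encoder layer, its output is projected back to the model width, and the two branches are summed and layer-normalized. The class name `TransBLSTMBlockSketch`, the projection layer, the combination rule, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class TransBLSTMBlockSketch(nn.Module):
    """Hypothetical block: a standard transformer encoder layer plus a BLSTM branch."""

    def __init__(self, d_model=64, n_heads=4, d_ff=128, lstm_hidden=32):
        super().__init__()
        # Standard encoder layer (self-attention + feed-forward), batch-first tensors.
        self.encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=d_ff, batch_first=True
        )
        # Bidirectional LSTM that reads the same block input (assumed wiring).
        self.blstm = nn.LSTM(d_model, lstm_hidden, batch_first=True, bidirectional=True)
        # Project the concatenated forward/backward states back to d_model.
        self.proj = nn.Linear(2 * lstm_hidden, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        attn_out = self.encoder_layer(x)
        lstm_out, _ = self.blstm(x)
        # Assumed combination rule: sum the two branches, then layer-normalize.
        return self.norm(attn_out + self.proj(lstm_out))


# Toy usage: a batch of 2 sequences, 10 tokens each, 64-dimensional embeddings.
block = TransBLSTMBlockSketch()
x = torch.randn(2, 10, 64)
y = block(x)
print(y.shape)
```

The output keeps the input shape, so blocks of this kind could be stacked like ordinary encoder layers. Other integration choices (for example, placing the BLSTM in series with the feed-forward sublayer) would fit the abstract's description equally well.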



BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

We introduce a new language representation model called BERT, which stan...

Finnish Language Modeling with Deep Transformer Models

Transformers have recently taken the center stage in language modeling a...

Reducing Transformer Depth on Demand with Structured Dropout

Overparameterized transformer networks have obtained state of the art re...

Building a Question and Answer System for News Domain

This project attempts to build a Question- Answering system in the News ...

Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel

Transformer is a powerful architecture that achieves superior performanc...

Support-BERT: Predicting Quality of Question-Answer Pairs in MSDN using Deep Bidirectional Transformer

Quality of questions and answers from community support websites (e.g. M...

VisBERT: Hidden-State Visualizations for Transformers

Explainability and interpretability are two important concepts, the abse...