Ensemble ALBERT on SQuAD 2.0

10/19/2021
by Shilun Li et al.

Machine question answering is an essential yet challenging task in natural language processing. Recently, Pre-trained Contextual Embedding (PCE) models such as Bidirectional Encoder Representations from Transformers (BERT) and A Lite BERT (ALBERT) have attracted considerable attention due to their strong performance on a wide range of NLP tasks. In this paper, we fine-tuned ALBERT models and implemented combinations of additional layers (e.g., an attention layer, an RNN layer) on top of them to improve performance on the Stanford Question Answering Dataset (SQuAD 2.0). We implemented four models with different layers on top of the ALBERT-base model, and two further models based on ALBERT-xlarge and ALBERT-xxlarge. We compared their performance in detail against our baseline model, ALBERT-base-v2 + ALBERT-SQuAD-out. Our best-performing individual model is ALBERT-xxlarge + ALBERT-SQuAD-out, which achieved an F1 score of 88.435 on the dev set. Furthermore, we implemented three different ensemble algorithms to boost overall performance. By passing the results of several of the best-performing models into our weighted voting ensemble algorithm, our final result ranks first on the Stanford CS224N Test PCE SQuAD Leaderboard with F1 = 90.123.
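The abstract only sketches the architecture, but the idea of adding layers on top of a fine-tuned ALBERT encoder can be illustrated with a minimal PyTorch sketch. The model name, hidden size, and the choice of a BiLSTM as the extra layer are assumptions for illustration; the paper's exact layer configurations are not given in the abstract.

```python
# Minimal sketch (not the authors' code): ALBERT encoder + an extra BiLSTM
# layer + a SQuAD-style start/end span head. Layer choices are assumptions.
import torch.nn as nn
from transformers import AlbertModel

class AlbertRNNForSquad(nn.Module):
    def __init__(self, model_name="albert-base-v2", hidden=768):
        super().__init__()
        self.albert = AlbertModel.from_pretrained(model_name)
        # Additional recurrent layer on top of the contextual embeddings.
        self.rnn = nn.LSTM(hidden, hidden // 2, batch_first=True,
                           bidirectional=True)
        # SQuAD output head: one logit each for the answer span's start and end.
        self.qa_outputs = nn.Linear(hidden, 2)

    def forward(self, input_ids, attention_mask):
        sequence = self.albert(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
        sequence, _ = self.rnn(sequence)
        start_logits, end_logits = self.qa_outputs(sequence).split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)
```

Likewise, the weighted voting ensemble can be sketched as below: each model contributes an answer string per question, identical answers pool their weight, and the highest-weighted answer wins. The per-model weights (e.g., dev-set F1 scores) and the prediction format are assumptions; the abstract does not specify the exact weighting scheme.

```python
# Minimal sketch of weighted voting over SQuAD-style predictions.
from collections import defaultdict

def weighted_vote(predictions, weights):
    """predictions: one {question_id: answer_text} dict per model.
    weights: one scalar per model, e.g. its dev-set F1 (an assumption)."""
    ensembled = {}
    for qid in predictions[0]:
        scores = defaultdict(float)
        for preds, weight in zip(predictions, weights):
            # Identical answer strings (including "" for "no answer") pool weight.
            scores[preds[qid]] += weight
        ensembled[qid] = max(scores, key=scores.get)
    return ensembled
```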

Related research

10/11/2018  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stan...

10/14/2019  Whatcha lookin' at? DeepLIFTing BERT's Attention in Question Answering
There has been great success recently in tackling challenging NLP tasks ...

05/12/2021  Building a Question and Answer System for News Domain
This project attempts to build a Question-Answering system in the News ...

04/07/2022  PALBERT: Teaching ALBERT to Ponder
Currently, pre-trained models can be considered the default choice for a...

12/14/2019  BERTQA – Attention on Steroids
In this work, we extend the Bidirectional Encoder Representations from T...

02/25/2020  Exploring BERT Parameter Efficiency on the Stanford Question Answering Dataset v2.0
In this paper we explore the parameter efficiency of BERT arXiv:1810.048...

09/17/2020  Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA
Many NLP tasks have benefited from transferring knowledge from contextua...
