DocBERT: BERT for Document Classification

04/17/2019
by   Ashutosh Adhikari, et al.
0

Pre-trained language representation models achieve remarkable state of the art across a wide range of tasks in natural language processing. One of the latest advancements is BERT, a deep pre-trained transformer that yields much better results than its predecessors do. Despite its burgeoning popularity, however, BERT has not yet been applied to document classification. This task deserves attention, since it contains a few nuances: first, modeling syntactic structure matters less for document classification than for other problems, such as natural language inference and sentiment classification. Second, documents often have multiple labels across dozens of classes, which is uncharacteristic of the tasks that BERT explores. In this paper, we describe fine-tuning BERT for document classification. We are the first to demonstrate the success of BERT on this task, achieving state of the art across four popular datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

page 5

research
05/18/2019

BERTSel: Answer Selection with Pre-trained Models

Recently, pre-trained models have been the dominant paradigm in natural ...
research
08/01/2019

MSnet: A BERT-based Network for Gendered Pronoun Resolution

The pre-trained BERT model achieves a remarkable state of the art across...
research
08/19/2020

UoB at SemEval-2020 Task 12: Boosting BERT with Corpus Level Information

Pre-trained language model word representation, such as BERT, have been ...
research
10/31/2019

Multi-Stage Document Ranking with BERT

The advent of deep neural networks pre-trained via language modeling tas...
research
03/06/2020

Sensitive Data Detection and Classification in Spanish Clinical Text: Experiments with BERT

Massive digital data processing provides a wide range of opportunities a...
research
11/28/2020

Understanding How BERT Learns to Identify Edits

Pre-trained transformer language models such as BERT are ubiquitous in N...
research
10/12/2020

Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations

Although BERT is widely used by the NLP community, little is known about...

Please sign up or login with your details

Forgot password? Click here to reset