BERT-based Chinese Text Classification for Emergency Domain with a Novel Loss Function

04/09/2021
by Zhongju Wang, et al.

This paper proposes an automatic Chinese text categorization method for classifying emergency event reports. Because bidirectional encoder representations from transformers (BERT) has achieved great success in the natural language processing domain, it is employed in this study to derive features from emergency text. To overcome the imbalance in the distribution of emergency event categories, a novel loss function is proposed to improve the performance of the BERT-based model. To avoid the impact of extreme learning rates, the AdaBound optimization algorithm, which makes a gradual, smooth transition from Adam to SGD, is employed to learn the model parameters. The feasibility and effectiveness of the proposed method are verified on a Chinese emergency text dataset collected from the Internet. Compared with the benchmark methods, the proposed method achieves the best performance in terms of accuracy, weighted precision, weighted recall, and weighted F1. The proposed method is therefore promising for real applications in smart emergency management systems.
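
The weighted metrics named in the abstract are commonly computed by scoring each class separately and averaging the per-class scores with weights proportional to each class's share of the true labels, which keeps rare categories from being ignored on an imbalanced dataset. The following is a minimal sketch of that common definition in plain Python. It is not the authors' evaluation code, and the function name `weighted_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1.

    Assumes single-label classification and equal-length, non-empty inputs.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        # True positives and total predictions for this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)

        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0

        # Weight each per-class score by the class's share of true labels.
        weight = count / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, imbalanced example with two hypothetical category ids.
scores = weighted_scores([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0])
print(scores)
```

Under this definition, weighted recall for single-label classification always equals accuracy, so the two values coincide whenever the metrics are computed this way.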

research
02/24/2021

RoBERTa-wwm-ext Fine-Tuning for Chinese Text Classification

Bidirectional Encoder Representations from Transformers (BERT) have show...
research
11/06/2021

Profitable Trade-Off Between Memory and Performance In Multi-Domain Chatbot Architectures

Text classification problem is a very broad field of study in the field ...
research
10/28/2020

A Chinese Text Classification Method With Low Hardware Requirement Based on Improved Model Concatenation

In order to improve the accuracy performance of Chinese text classificat...
research
11/21/2022

TCBERT: A Technical Report for Chinese Topic Classification BERT

Bidirectional Encoder Representations from Transformers or BERT h...
research
09/20/2019

BERT Meets Chinese Word Segmentation

Chinese word segmentation (CWS) is a fundamental task for Chinese langua...
research
05/02/2023

Cancer Hallmark Classification Using Bidirectional Encoder Representations From Transformers

This paper presents a novel approach to accurately classify the hallmark...
research
04/21/2023

Downstream Task-Oriented Neural Tokenizer Optimization with Vocabulary Restriction as Post Processing

This paper proposes a method to optimize tokenization for the performanc...
