Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models

04/15/2021
by Yuxuan Lai, et al.

Chinese pre-trained language models usually process text as a sequence of characters, ignoring coarser granularities such as words. In this work, we propose a novel pre-training paradigm for Chinese, Lattice-BERT, which explicitly incorporates word representations along with characters and can thus model a sentence in a multi-granularity manner. Specifically, we construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers. We design a lattice position attention mechanism to exploit the lattice structures in self-attention layers. We further propose a masked segment prediction task to push the model to learn from the rich but redundant information inherent in lattices, while avoiding learning unexpected tricks. Experiments on 11 Chinese natural language understanding tasks show that our model can bring an average increase of 1.5% under the 12-layer setting, achieving new state-of-the-art results among base-size models on the CLUE benchmarks. Further analysis shows that Lattice-BERT can harness the lattice structures, and that the improvement comes from the exploitation of redundant information and multi-granularity representations. Our code will be available at https://github.com/alibaba/pretrained-language-models/LatticeBERT.
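To make the lattice construction concrete, here is a minimal sketch (not the authors' released code) of turning a sentence into the character-word lattice units described above. The toy dictionary, the `Unit` record, and the `build_lattice` helper are hypothetical illustrations; the key idea from the abstract is that every unit, whether a single character or a dictionary word, keeps its character-level span so that overlapping, redundant units can coexist in one input.

```python
# Hypothetical sketch of character-word lattice construction, assuming a
# toy dictionary. Each lattice unit records its character span so that a
# lattice position attention can later relate overlapping units.

from dataclasses import dataclass

@dataclass
class Unit:
    text: str   # a character or a dictionary word
    start: int  # index of its first character in the sentence
    end: int    # index one past its last character

def build_lattice(sentence: str, vocab: set, max_word_len: int = 4) -> list:
    """Collect every character plus every in-vocabulary word span as units."""
    units = [Unit(ch, i, i + 1) for i, ch in enumerate(sentence)]
    for i in range(len(sentence)):
        for j in range(i + 2, min(i + max_word_len, len(sentence)) + 1):
            if sentence[i:j] in vocab:
                units.append(Unit(sentence[i:j], i, j))
    return units

# Toy example with an illustrative dictionary: note the overlapping units
# 研究 / 研究生 and 生命 / 生命科学, the "rich but redundant" information
# the masked segment prediction task is designed to exploit.
vocab = {"研究", "研究生", "生命", "科学", "生命科学"}
for u in build_lattice("研究生命科学", vocab):
    print(u.text, (u.start, u.end))
```

Because the flattened units carry spans rather than a single linear order, the self-attention layers can compute position relations (overlap, containment, adjacency) between any pair of units, which is what the lattice position attention mechanism described above operates on.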


Related research

03/20/2023 · Character, Word, or Both? Revisiting the Segmentation Granularity for Chinese Pre-trained Language Models
Pretrained language models (PLMs) have shown marvelous improvements acro...

02/25/2021 · LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching
Chinese short text matching is a fundamental task in natural language pr...

03/23/2023 · Retrieval-Augmented Classification with Decoupled Representation
Pretrained language models (PLMs) have shown marvelous improvements acro...

02/25/2019 · Lattice CNNs for Matching Based Chinese Question Answering
Short text matching often faces the challenges that there are great word...

07/06/2020 · Learning Spoken Language Representations with Neural Lattice Language Modeling
Pre-trained language models have achieved huge improvement on many NLP t...

10/14/2021 · Building Chinese Biomedical Language Models via Multi-Level Text Discrimination
Pre-trained language models (PLMs), such as BERT and GPT, have revolutio...

10/20/2022 · Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity
Chinese spelling check (CSC) is a fundamental NLP task that detects and ...
