Attention Is All You Need for Chinese Word Segmentation

10/31/2019
by   Sufeng Duan, et al.
0

This paper presents a fast and accurate Chinese word segmentation (CWS) model with only unigram feature and greedy decoding algorithm. Our model uses only attention mechanism for network block building. In detail, we adopt a Transformer-based encoder empowered by self-attention mechanism as backbone to take input representation. Then we extend the Transformer encoder with our proposed Gaussian-masked directional multi-head attention, which is a variant of scaled dot-product attention. At last, a bi-affinal attention scorer is to make segmentation decision in a linear time. Our model is evaluated on SIGHAN Bakeoff benchmark dataset. The experimental results show that with the highest segmentation speed, the proposed attention-only model achieves new state-of-the-art or comparable performance against strong baselines in terms of closed test setting.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/02/2018

Fast Directional Self-Attention Mechanism

In this paper, we propose a self-attention mechanism, dubbed "fast direc...
research
05/16/2019

Incorporating Sememes into Chinese Definition Modeling

Chinese definition modeling is a challenging task that generates a dicti...
research
10/15/2019

Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving

We incorporate Tensor-Product Representations within the Transformer in ...
research
10/16/2018

Semantic Aware Attention Based Deep Object Co-segmentation

Object co-segmentation is the task of segmenting the same objects from m...
research
05/12/2022

Supplementary Material: Implementation and Experiments for GAU-based Model

In February this year Google proposed a new Transformer variant called F...
research
08/21/2022

DPTNet: A Dual-Path Transformer Architecture for Scene Text Detection

The prosperity of deep learning contributes to the rapid progress in sce...
research
10/15/2019

Language Identification on Massive Datasets of Short Message using an Attention Mechanism CNN

Language Identification (LID) is a challenging task, especially when the...

Please sign up or login with your details

Forgot password? Click here to reset