Multi-Criteria Chinese Word Segmentation with Transformer

06/28/2019
by   Xipeng Qiu, et al.
0

Different linguistic perspectives cause many diverse segmentation criteria for Chinese word segmentation (CWS). Most existing methods focus on improving the performance of single-criterion CWS. However, it is interesting to exploit these heterogeneous segmentation criteria and mine their common underlying knowledge. In this paper, we propose a concise and effective model for multi-criteria CWS, which utilizes a shared fully-connected self-attention model to segment the sentence according to a criterion indicator. Experiments on eight datasets with heterogeneous segmentation criteria show that the performance of each corpus obtains a significant improvement, compared to single-criterion learning.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset