
fastHan: A BERT-based Joint Many-Task Toolkit for Chinese NLP

by Zhichao Geng, et al.

We present fastHan, an open-source toolkit for four basic tasks in Chinese natural language processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, and dependency parsing. The kernel of fastHan is a joint many-task model based on a pruned BERT that keeps only the first 8 layers of BERT. We also provide a 4-layer base version of the model, compressed from the 8-layer model. The joint model is trained and evaluated on 13 corpora covering the four tasks, yielding near state-of-the-art (SOTA) performance on dependency parsing and SOTA performance on the other three tasks. In addition to its small size and strong performance, fastHan is also very user-friendly. It is implemented as a Python package, so users can easily download and use it. Users can obtain the results they need with one line of code, even if they have little knowledge of deep learning. The project is released on GitHub.
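To make the "one line of code" claim concrete, here is a minimal usage sketch. It assumes the package is installed under the name `fastHan`, that it exposes a `FastHan` class whose instances are callable as `model(sentence, target=...)`, and that the four tasks are selected with the target strings `"CWS"`, `"POS"`, `"NER"`, and `"Parsing"`. These details follow the project's README as best recalled and should be checked against the installed version. The helpers `run_all_tasks` and `load_default_model` are hypothetical names introduced only for this illustration.

```python
# Sketch only: the FastHan class and the target strings are assumptions based
# on the project's README; verify them against the version you install.
try:
    from fastHan import FastHan  # assumed install: pip install fastHan
except ImportError:  # keep this sketch importable without the package
    FastHan = None

# The four tasks named in the abstract, as target strings (assumed spelling).
TARGETS = ("CWS", "POS", "NER", "Parsing")


def run_all_tasks(model, sentence, targets=TARGETS):
    """Call `model` once per task and collect the results by target name.

    `model` is any callable accepting (sentence, target=...); a loaded
    FastHan instance is assumed to behave this way. (Hypothetical helper.)
    """
    return {target: model(sentence, target=target) for target in targets}


def load_default_model():
    """Build the default model. (Hypothetical helper.)

    Constructing FastHan is expected to download pretrained weights the
    first time it runs, so this is kept out of import time.
    """
    if FastHan is None:
        raise RuntimeError("fastHan is not installed")
    return FastHan()
```

With the package installed, a call such as `run_all_tasks(load_default_model(), "郭靖是金庸笔下的男主角。")` should return one result per task. The single-task form, `model(sentence, target="NER")`, is the one-line usage the abstract refers to.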



