N-LTP: An Open-source Neural Chinese Language Technology Platform with Pretrained Models

09/24/2020
by   Wanxiang Che, et al.

We introduce N-LTP, an open-source Python toolkit for Chinese natural language processing that supports five basic tasks: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic dependency parsing. N-LTP adopts a multi-task framework built on a pre-trained model to capture the knowledge shared across all relevant Chinese tasks. In addition, we propose to use knowledge distillation, in which single-task models teach a multi-task model, helping the multi-task model surpass its single-task teachers. Finally, we provide APIs for the fundamental tasks and a visualization tool so that users can easily run the toolkit and view the processing results directly. To the best of our knowledge, this is the first toolkit to support all fundamental Chinese NLP tasks. Source code, documentation, and pre-trained models are available at https://github.com/HIT-SCIR/ltp.
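
The distillation idea in the abstract, where a single-task teacher guides a multi-task student, can be illustrated with a small sketch. The abstract does not give the loss that N-LTP uses, so the code below assumes the common soft-target formulation: a weighted sum of cross-entropy against the gold label and cross-entropy against the teacher's temperature-softened distribution. The function names and the `alpha` and `temperature` parameters are illustrative assumptions, not part of the toolkit.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target: Sequence[float], predicted: Sequence[float]) -> float:
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    gold_index: int,
    alpha: float = 0.5,        # assumed weight on the gold-label term
    temperature: float = 2.0,  # assumed softening temperature
) -> float:
    """Assumed generic distillation loss for one token of one task.

    hard term: student vs. the gold label (one-hot)
    soft term: student vs. the single-task teacher's softened distribution
    """
    one_hot = [1.0 if i == gold_index else 0.0 for i in range(len(student_logits))]
    hard = cross_entropy(one_hot, softmax(student_logits))
    soft = cross_entropy(
        softmax(teacher_logits, temperature),
        softmax(student_logits, temperature),
    )
    return alpha * hard + (1.0 - alpha) * soft
```

In a multi-task setting, a loss of this shape would be computed per task, each with its own teacher, and summed over the tasks that share the pre-trained encoder. With `alpha=1.0` the teacher is ignored and the loss reduces to ordinary supervised cross-entropy.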



09/18/2020

fastHan: A BERT-based Joint Many-Task Toolkit for Chinese NLP

We present fastHan, an open-source toolkit for four basic tasks in Chine...
01/05/2021

PhoNLP: A joint multi-task learning model for Vietnamese part-of-speech tagging, named entity recognition and dependency parsing

We present the first multi-task learning model – named PhoNLP – for join...
06/27/2019

PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation

Chinese word segmentation (CWS) is a fundamental step of Chinese natural...
06/09/2017

Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization

In this paper, we give an overview for the shared task at the CCF Confer...
03/16/2020

Stanza: A Python Natural Language Processing Toolkit for Many Human Languages

We introduce Stanza, an open-source Python natural language processing t...
09/08/2021

ELIT: Emory Language and Information Toolkit

We introduce ELIT, the Emory Language and Information Toolkit, which is ...
01/09/2021

Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing

We introduce Trankit, a light-weight Transformer-based Toolkit for multi...