N-LTP: An Open-source Neural Chinese Language Technology Platform with Pretrained Models

09/24/2020
by Wanxiang Che, et al.

We introduce N-LTP, an open-source Python toolkit for Chinese natural language processing that supports five basic tasks: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic dependency parsing. N-LTP adopts a multi-task framework built on a pre-trained model to capture the knowledge shared across all Chinese-related tasks. In addition, we propose to use knowledge distillation, in which single-task models teach a multi-task model, helping the multi-task model surpass its single-task teachers. Finally, we provide APIs for the fundamental tasks and a visualization tool so that users can easily run the toolkit and view the processing results directly. To the best of our knowledge, this is the first toolkit to support all fundamental Chinese NLP tasks. Source code, documentation, and pre-trained models are available at https://github.com/HIT-SCIR/ltp.
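
The distillation idea in the abstract, where a single-task teacher guides one task head of a multi-task student, can be illustrated with a generic objective. The sketch below is not taken from the N-LTP code base: the function names, the mixing weight `alpha`, and the dictionary layout are all assumptions made for illustration. It mixes the gold label with the teacher's predicted distribution and scores the student against that mixed target, then sums the result over tasks.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, gold_index, alpha=0.5):
    """Hypothetical per-example loss for one task.

    The target is alpha * one-hot(gold) + (1 - alpha) * teacher distribution.
    alpha is an assumed hyperparameter, not a value from the paper.
    """
    student_probs = softmax(student_logits)
    teacher_probs = softmax(teacher_logits)
    one_hot = [1.0 if i == gold_index else 0.0
               for i in range(len(student_logits))]
    target = [alpha * g + (1.0 - alpha) * t
              for g, t in zip(one_hot, teacher_probs)]
    return cross_entropy(target, student_probs)


def multi_task_loss(per_task_inputs, alpha=0.5):
    """Sum the distillation loss over tasks; each task has its own teacher."""
    return sum(distillation_loss(alpha=alpha, **inputs)
               for inputs in per_task_inputs.values())


# Toy logits for two of the five tasks (values are made up).
batch = {
    "pos": {"student_logits": [2.0, 0.1, 0.3],
            "teacher_logits": [3.0, 0.2, 0.1],
            "gold_index": 0},
    "ner": {"student_logits": [0.2, 1.5],
            "teacher_logits": [0.1, 2.5],
            "gold_index": 1},
}
print(multi_task_loss(batch))
```

With `alpha=1.0` the loss reduces to ordinary supervised cross-entropy, and with `alpha=0.0` the student learns from the teacher alone; a schedule that changes `alpha` during training is one common variant.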


research
09/18/2020

fastHan: A BERT-based Joint Many-Task Toolkit for Chinese NLP

We present fastHan, an open-source toolkit for four basic tasks in Chine...
research
01/05/2021

PhoNLP: A joint multi-task learning model for Vietnamese part-of-speech tagging, named entity recognition and dependency parsing

We present the first multi-task learning model – named PhoNLP – for join...
research
06/27/2019

PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation

Chinese word segmentation (CWS) is a fundamental step of Chinese natural...
research
03/16/2020

Stanza: A Python Natural Language Processing Toolkit for Many Human Languages

We introduce Stanza, an open-source Python natural language processing t...
research
06/09/2017

Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization

In this paper, we give an overview for the shared task at the CCF Confer...
research
02/14/2023

READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises

For many real-world applications, the user-generated inputs usually cont...
research
09/08/2021

ELIT: Emory Language and Information Toolkit

We introduce ELIT, the Emory Language and Information Toolkit, which is ...
