TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis

12/31/2020
by Haisong Zhang, et al.

This technical report introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis. Compared with most previous publicly available text understanding systems and tools, TexSmart has several distinguishing features. First, its NER function supports over 1,000 entity types, whereas most other public tools support from several to, at most, dozens of entity types. Second, TexSmart introduces new semantic analysis functions, such as semantic expansion and deep semantic representation, which are absent from most previous systems. Third, a spectrum of algorithms, from very fast ones to those that are slower but more accurate, is implemented for a single function, to meet the requirements of different academic and industrial applications. The adoption of unsupervised or weakly supervised algorithms is especially emphasized, with the goal of easily updating our models to include fresh data with less human annotation effort. The report covers the major functions of TexSmart, the algorithms behind them, how to use the TexSmart toolkit and Web APIs, and evaluation results for some key algorithms.

research
05/16/2021

Few-NERD: A Few-Shot Named Entity Recognition Dataset

Recently, considerable literature has grown up around the theme of few-s...
research
02/08/2017

Automatically Annotated Turkish Corpus for Named Entity Recognition and Text Categorization using Large-Scale Gazetteers

Turkish Wikipedia Named-Entity Recognition and Text Categorization (TWNE...
research
09/15/2020

Cascaded Models for Better Fine-Grained Named Entity Recognition

Named Entity Recognition (NER) is an essential precursor task for many n...
research
12/16/2021

Simple Questions Generate Named Entity Recognition Datasets

Named entity recognition (NER) is a task of extracting named entities of...
research
01/13/2020

CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese

In this paper, we introduce the NER dataset from CLUE organization (CLUE...
research
06/12/2017

Semantic Entity Retrieval Toolkit

Unsupervised learning of low-dimensional, semantic representations of wo...
research
01/13/2020

CLUENER2020: Fine-grained Name Entity Recognition for Chinese

In this paper, we introduce the NER dataset from CLUE organization (CLUE...
