FlexiTerm: A more efficient implementation of flexible multi-word term recognition

10/13/2021
by Irena Spasić, et al.

Terms are linguistic signifiers of domain-specific concepts. Automated recognition of multi-word terms (MWTs) in free text is a sequence labelling problem, which is commonly addressed using supervised machine learning methods. Because these methods require manually annotated training data, they are difficult to port across domains. FlexiTerm, on the other hand, is a fully unsupervised method for MWT recognition from domain-specific corpora. Originally implemented in Java as a proof of concept, it did not scale well and therefore offered little practical value in the context of big data. In this paper, we describe its re-implementation in Python and compare the performance of the two implementations. The results demonstrate major improvements in efficiency, which allow FlexiTerm to transition from a proof of concept to a production-grade application.

research
05/24/2023

A Distributed Automatic Domain-Specific Multi-Word Term Recognition Architecture using Spark Ecosystem

Automatic Term Recognition is used to extract domain-specific terms that...
research
07/17/2019

Differentiable Disentanglement Filter: an Application Agnostic Core Concept Discovery Probe

It has long been speculated that deep neural networks function by discov...
research
02/24/2016

A Survey on Domain-Specific Languages for Machine Learning in Big Data

The amount of data generated in the modern society is increasing rapidly...
research
10/01/2019

Essentia: Mining Domain-specific Paraphrases with Word-Alignment Graphs

Paraphrases are important linguistic resources for a wide variety of NLP...
research
05/05/2022

Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation

Open-domain conversational systems are assumed to generate equally good ...
research
04/30/2019

FastContext: an efficient and scalable implementation of the ConText algorithm

Objective: To develop and evaluate FastContext, an efficient, scalable i...
research
03/29/2018

Proof-of-Concept Examples of Performance-Transparent Programming Models

Machine-specific optimizations command the machine to behave in a specif...
