Highly Fast Text Segmentation With Pairwise Markov Chains

by Elie Azeraf, et al.

The current trend in Natural Language Processing (NLP) is to use ever more extra data to build the best possible models. This increases computational cost and training time, complicates deployment, and raises concerns about these models' carbon footprint, pointing to a critical problem for the future. Against this trend, our goal is to develop NLP models that require no extra data and minimize training time. To that end, in this paper we explore two Markov chain models, the Hidden Markov Chain (HMC) and the Pairwise Markov Chain (PMC), for NLP segmentation tasks. We apply these models to three classic applications: POS tagging, Named-Entity Recognition, and chunking. We develop an original method to adapt these models to the specific challenges of text segmentation, obtaining relevant performance with very short training and execution times. PMC achieves results equivalent to those obtained by Conditional Random Fields (CRF), one of the most widely applied models for these tasks when no extra data are used. Moreover, PMC's training times are 30 times shorter than those of CRF, which validates this model given our objectives.
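
For readers unfamiliar with the two models named in the abstract, their standard joint factorizations can be written as below. The notation is an assumption made here for illustration and is not taken from the abstract: x_{1:T} denotes the hidden label sequence (for example POS tags), y_{1:T} the observed word sequence, and T the sentence length.

```latex
% Hidden Markov Chain (HMC): the labels form a Markov chain,
% and each word depends only on its own label.
p(x_{1:T}, y_{1:T}) = p(x_1)\, p(y_1 \mid x_1)
  \prod_{t=1}^{T-1} p(x_{t+1} \mid x_t)\, p(y_{t+1} \mid x_{t+1})

% Pairwise Markov Chain (PMC): the pair (x_t, y_t) is itself a Markov chain.
p(x_{1:T}, y_{1:T}) = p(x_1, y_1)
  \prod_{t=1}^{T-1} p(x_{t+1}, y_{t+1} \mid x_t, y_t)
```

Under this notation, the HMC is the special case of the PMC in which p(x_{t+1}, y_{t+1} | x_t, y_t) reduces to p(x_{t+1} | x_t) p(y_{t+1} | x_{t+1}), so the PMC is strictly more general.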





On equivalence between linear-chain conditional random fields and hidden Markov chains

Practitioners successfully use hidden Markov chains (HMCs) in different ...

Introducing the Hidden Neural Markov Chain framework

Nowadays, neural network models achieve state-of-the-art results in many...

Connecting Distant Entities with Induction through Conditional Random Fields for Named Entity Recognition: Precursor-Induced CRF

This paper presents a method of designing specific high-order dependency...

Decoding with Finite-State Transducers on GPUs

Weighted finite automata and transducers (including hidden Markov models...

Exploring Segment Representations for Neural Segmentation Models

Many natural language processing (NLP) tasks can be generalized into seg...

Hidden Markov Chains, Entropic Forward-Backward, and Part-Of-Speech Tagging

The ability to take into account the characteristics - also called featu...

Bayesian Structured Prediction Using Gaussian Processes

We introduce a conceptually novel structured prediction model, GPstruct,...