Hybrid lemmatization in HuSpaCy

06/13/2023
by Péter Berkecz, et al.

Lemmatization remains a non-trivial task for morphologically rich languages. Previous studies have shown that hybrid architectures usually work better for these languages. This paper presents a hybrid lemmatizer that combines a neural model, dictionaries, and hand-crafted rules. We describe the architecture and report empirical results on a widely used Hungarian dataset. The presented methods are published as three HuSpaCy models.
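To make the idea of a hybrid lemmatizer concrete, the following is a minimal sketch of how a dictionary, suffix rules, and a learned model could be chained. It is not the HuSpaCy implementation: the names `LEMMA_DICT`, `SUFFIX_RULES`, `rule_lemma`, and `hybrid_lemmatize` are hypothetical, the two-entry resources are toy data, the ordering of the components is an assumption, and the neural model is represented by an arbitrary callable.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy resources for illustration; not taken from HuSpaCy.
LEMMA_DICT: Dict[str, str] = {"voltam": "van"}      # irregular form -> lemma
SUFFIX_RULES: List[Tuple[str, str]] = [("ak", "")]  # (suffix, replacement)


def rule_lemma(token: str) -> Optional[str]:
    """Apply the first matching hand-written suffix rule, if any."""
    for suffix, replacement in SUFFIX_RULES:
        if token.endswith(suffix) and len(token) > len(suffix):
            return token[: -len(suffix)] + replacement
    return None


def hybrid_lemmatize(token: str, neural: Callable[[str], str]) -> str:
    """Try the dictionary, then the rules, then fall back to the model.

    The order of the three components is an assumption of this sketch.
    """
    key = token.lower()
    if key in LEMMA_DICT:
        return LEMMA_DICT[key]
    guess = rule_lemma(token)
    if guess is not None:
        return guess
    return neural(token)


def identity_model(token: str) -> str:
    """Stand-in for a trained neural lemmatizer; returns the token unchanged."""
    return token
```

Calling `hybrid_lemmatize("házak", identity_model)` exercises the rule path, while a token that matches neither the dictionary nor a rule is passed to the stand-in model. In a real system the callable would wrap a trained model, and the dictionary and rules would be far larger.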
