
Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

by Pranav A, et al.

Ranking functions in information retrieval are often used in search engines to recommend answers relevant to a query. This paper applies this notion from information retrieval to the problem of cognate detection. The main contributions of this paper are: (1) positional segmentation, which incorporates the sequential notion; (2) graphical error modelling, which deduces the transformations. Current research focuses on the classification problem: distinguishing whether a pair of words are cognates. This paper focuses on a harder problem: whether a possible cognate can be predicted from a given input. Our study shows that language modelling smoothing methods, applied as retrieval functions and used in conjunction with positional segmentation and error modelling, give better results than competing baselines in both classification and prediction of cognates. Source code is at:
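
Separately from the authors' released code, the retrieval framing in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the segmentation scheme (character bigrams tagged with their start index), the choice of Dirichlet smoothing as the language modelling smoothing method, the add-one collection estimate, and all function names (`positional_segments`, `dirichlet_score`, `rank_candidates`) are assumptions made for illustration. Graphical error modelling is not covered.

```python
import math
from collections import Counter


def positional_segments(word, n=2):
    """Split a word into character n-grams tagged with their start index.

    Assumed scheme for illustration; the paper's segmentation may differ.
    """
    padded = "#" + word + "#"  # '#' marks the word boundaries
    return [f"{i}:{padded[i:i + n]}" for i in range(len(padded) - n + 1)]


def dirichlet_score(query, candidate, collection_counts, collection_len, mu=1.0):
    """Query log-likelihood of `query` under the candidate's smoothed segment model.

    Dirichlet smoothing: p(s|d) = (c(s, d) + mu * p(s|C)) / (|d| + mu).
    """
    doc = Counter(positional_segments(candidate))
    doc_len = sum(doc.values())
    vocab = len(collection_counts)
    score = 0.0
    for seg in positional_segments(query):
        # Add-one estimate (an assumption) so unseen segments keep non-zero mass.
        p_coll = (collection_counts.get(seg, 0) + 1) / (collection_len + vocab + 1)
        score += math.log((doc[seg] + mu * p_coll) / (doc_len + mu))
    return score


def rank_candidates(query, lexicon, mu=1.0):
    """Rank lexicon entries as possible cognates of `query`, best first."""
    counts = Counter(seg for w in lexicon for seg in positional_segments(w))
    total = sum(counts.values())
    return sorted(
        lexicon,
        key=lambda w: dirichlet_score(query, w, counts, total, mu),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank_candidates("night", ["water", "nigt", "stone"]))
```

Treating the input word as the query and each lexicon entry as a document turns cognate prediction into a ranking task; the positional tag means a shared bigram only counts as a match when it occurs at the same offset in both words.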



