Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

11/20/2018
by Pranav A, et al.

Ranking functions in information retrieval are often used in search engines to recommend relevant answers to a query. This paper takes this notion from information retrieval and applies it to the problem of cognate detection. The main contributions of this paper are: (1) positional segmentation, which incorporates the sequential notion; and (2) graphical error modelling, which deduces the transformations. Existing research focuses on the classification problem, that is, distinguishing whether a pair of words are cognates. This paper addresses a harder problem: whether a possible cognate can be predicted from a given input. Our study shows that language modelling smoothing methods, when applied as retrieval functions and used in conjunction with positional segmentation and error modelling, give better results than competing baselines in both classification and prediction of cognates. Source code is at: https://github.com/pranav-ust/cognates
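The retrieval view described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not specify the segmentation scheme or the smoothing method, so the sketch assumes position-tagged character bigrams for the segmentation and Dirichlet-prior smoothing as the retrieval function, and it leaves out the error modelling step entirely. The names `positional_segments`, `dirichlet_score`, and `rank_candidates`, the `#` padding symbol, and the value of `mu` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def positional_segments(word, n=2):
    """Split a word into character n-grams, each tagged with its start index.

    Assumption: tagging each n-gram with its position is one simple way to
    keep sequential information; the paper's exact scheme may differ.
    """
    padded = f"#{word}#"  # '#' marks the word boundaries (hypothetical choice)
    return [f"{padded[i:i + n]}@{i}" for i in range(len(padded) - n + 1)]


def dirichlet_score(query_segs, doc_segs, coll_counts, coll_len, mu=10.0):
    """Log query likelihood of query_segs under a Dirichlet-smoothed model."""
    doc_counts = Counter(doc_segs)
    doc_len = len(doc_segs)
    vocab = len(coll_counts) + 1  # one extra slot for unseen segments
    score = 0.0
    for seg in query_segs:
        # Add-one collection estimate so unseen segments keep nonzero mass.
        p_coll = (coll_counts.get(seg, 0) + 1) / (coll_len + vocab)
        p_seg = (doc_counts.get(seg, 0) + mu * p_coll) / (doc_len + mu)
        score += math.log(p_seg)
    return score


def rank_candidates(query, candidates, n=2, mu=10.0):
    """Rank candidate words by how well they 'retrieve' for the query word."""
    segmented = {c: positional_segments(c, n) for c in candidates}
    coll_counts = Counter(s for segs in segmented.values() for s in segs)
    coll_len = sum(coll_counts.values())
    query_segs = positional_segments(query, n)
    return sorted(
        candidates,
        key=lambda c: dirichlet_score(
            query_segs, segmented[c], coll_counts, coll_len, mu
        ),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy lexicon chosen for illustration only; not data from the paper.
    print(rank_candidates("night", ["nacht", "tag", "hund", "wasser"]))
```

In this framing, the input word acts as the query and every word in the target lexicon acts as a document, so predicting a cognate reduces to returning the top-ranked candidate, and classifying a pair could be done by thresholding its score.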


research
01/04/2023

InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval

Recently, InPars introduced a method to efficiently use large language m...
research
03/14/2018

Feature Selection and Model Comparison on Microsoft Learning-to-Rank Data Sets

With the rapid advance of the Internet, search engines (e.g., Google, Bi...
research
02/06/2019

A Comparison of Information Retrieval Techniques for Detecting Source Code Plagiarism

Plagiarism is a commonly encountered problem in the academia. While ther...
research
11/25/2021

Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators

Heavily pre-trained transformers for language modelling, such as BERT, h...
research
06/17/2023

Typo-Robust Representation Learning for Dense Retrieval

Dense retrieval is a basic building block of information retrieval appli...
research
01/05/2017

Exploration of Proximity Heuristics in Length Normalization

Ranking functions used in information retrieval are primarily used in th...
