Measuring Sentences Similarity: A Survey

10/06/2019
by Mamdouh Farouk, et al.

This study reviews the approaches used for measuring sentence similarity. Measuring similarity between natural language sentences is a crucial task for many Natural Language Processing applications, such as text classification, information retrieval, question answering, and plagiarism detection. The survey classifies approaches to calculating sentence similarity into three categories according to the methodology they adopt: word-to-word based, structure-based, and vector-based, which are the most widely used approaches. Each approach measures relatedness between short texts from a specific perspective. In addition, the datasets most commonly used as benchmarks for evaluating techniques in this field are introduced to give a complete view of the topic. Approaches that combine more than one perspective give better results. Moreover, structure-based similarity, which measures similarity between sentence structures, needs more investigation.
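
To make two of the three categories concrete, here is a minimal sketch in Python. It is not taken from the survey: the function names, the whitespace tokenizer, and the exact-match word similarity are all illustrative assumptions. The word-to-word function aligns each word with its best match in the other sentence and averages the scores; the vector-based function compares bag-of-words count vectors by cosine similarity. Real systems would typically replace the exact-match word similarity with a lexical resource or embeddings, and the count vectors with learned sentence vectors.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return sentence.lower().split()


def exact_match(w1, w2):
    """Placeholder word similarity: 1.0 for identical words, else 0.0."""
    return 1.0 if w1 == w2 else 0.0


def word_to_word_similarity(a, b, word_sim=exact_match):
    """Average best-match word similarity, computed in both directions."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    a_to_b = sum(max(word_sim(x, y) for y in tb) for x in ta) / len(ta)
    b_to_a = sum(max(word_sim(y, x) for x in ta) for y in tb) / len(tb)
    return (a_to_b + b_to_a) / 2


def vector_similarity(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    s1 = "A man is playing a guitar"
    s2 = "A man plays the guitar"
    print(word_to_word_similarity(s1, s2))
    print(vector_similarity(s1, s2))
```

Passing a different `word_sim` callable (a hypothetical parameter of this sketch) is where a knowledge-based or embedding-based word measure would plug in. A combined score, in the spirit of the hybrid approaches the abstract mentions, could be a weighted average of the two functions.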

research
05/03/2021

A novel hybrid methodology of measuring sentence similarity

The problem of measuring sentence similarity is an essential issue in th...
research
02/17/2016

A Comprehensive Comparative Study of Word and Sentence Similarity Measures

Sentence similarity is considered the basis of many natural language tas...
research
05/24/2023

CSTS: Conditional Semantic Textual Similarity

Semantic textual similarity (STS) has been a cornerstone task in NLP tha...
research
10/25/2022

Similarity between Units of Natural Language: The Transition from Coarse to Fine Estimation

Capturing the similarities between human language units is crucial for e...
research
04/19/2020

Evolution of Semantic Similarity – A Survey

Estimating the semantic similarity between text data is one of the chall...
research
11/29/2022

Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora

The ability to compare the semantic similarity between text corpora is i...
research
12/11/2019

CoSimLex: A Resource for Evaluating Graded Word Similarity in Context

State of the art natural language processing tools are built on context-...
