What Makes Sentences Semantically Related: A Textual Relatedness Dataset and Empirical Study

10/10/2021
by Mohamed Abdalla, et al.

The degree of semantic relatedness (or, closeness in meaning) of two units of language has long been considered fundamental to understanding meaning. Automatically determining relatedness has many applications such as question answering and summarization. However, prior NLP work has largely focused on semantic similarity (a subset of relatedness), because of a lack of relatedness datasets. Here for the first time, we introduce a dataset of semantic relatedness for sentence pairs. This dataset, STR-2021, has 5,500 English sentence pairs manually annotated for semantic relatedness using a comparative annotation framework. We show that the resulting scores have high reliability (repeat annotation correlation of 0.84). We use the dataset to explore a number of questions on what makes two sentences more semantically related. We also evaluate a suite of sentence representation methods on their ability to place pairs that are more related closer to each other in vector space.
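
To make the evaluation idea in the abstract concrete, here is a minimal sketch of how one might check whether a representation method places more related pairs closer together in vector space. It assumes cosine similarity as the closeness measure and Spearman rank correlation as the agreement measure; the abstract names neither, so both are illustrative choices. The vectors and gold scores are invented toy values, not taken from STR-2021, and the helper names (`cosine`, `ranks`, `spearman`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy sentence-pair embeddings (assumed values, for illustration only).
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.1, 0.0]),
    ([1.0, 0.0, 0.0], [0.5, 0.5, 0.0]),
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    ([1.0, 0.0, 0.0], [0.2, 1.0, 0.0]),
]
# Toy human relatedness scores for the same pairs (assumed values).
gold = [0.9, 0.6, 0.1, 0.3]

predicted = [cosine(u, v) for u, v in pairs]
rho = spearman(predicted, gold)
print(f"Spearman correlation with gold scores: {rho:.3f}")
```

A higher correlation would indicate that the method's similarity ordering agrees more closely with the human relatedness ordering. With real data, the toy vectors would be replaced by embeddings from the method under test and the toy scores by the dataset's annotations.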

research
07/05/2020

CORD19STS: COVID-19 Semantic Textual Similarity Dataset

In order to combat the COVID-19 pandemic, society can benefit from vario...
research
02/17/2016

A Comprehensive Comparative Study of Word and Sentence Similarity Measures

Sentence similarity is considered the basis of many natural language tas...
research
02/09/2021

Decontextualization: Making Sentences Stand-Alone

Models for question answering, dialogue agents, and summarization often ...
research
06/14/2021

Improving Paraphrase Detection with the Adversarial Paraphrasing Task

If two sentences have the same meaning, it should follow that they are e...
research
12/14/2016

Interpretable Semantic Textual Similarity: Finding and explaining differences between sentences

User acceptance of artificial intelligence agents might depend on their ...
research
12/17/2018

Siamese Networks for Semantic Pattern Similarity

Semantic Pattern Similarity is an interesting, though not often encounte...
research
12/21/2021

Sentence Embeddings and High-speed Similarity Search for Fast Computer Assisted Annotation of Legal Documents

Human-performed annotation of sentences in legal documents is an importa...
