Using Word Embedding for Cross-Language Plagiarism Detection

02/10/2017
by J. Ferrero, et al.

This paper proposes using distributed representations of words (word embeddings) for cross-language textual similarity detection. Its main contributions are the following: (a) we introduce new cross-language similarity detection methods based on distributed representations of words; (b) we combine the proposed methods to verify their complementarity, and obtain an overall F1 score of 89.15% for English-French similarity detection at chunk level (88.5% at sentence level) on a very challenging corpus.
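
As a rough illustration of the general idea of comparing text units through word embeddings, the sketch below averages the word vectors of each chunk and compares the averages with cosine similarity. It is not one of the methods proposed in the paper: the toy vectors, the assumption of a shared English-French embedding space, and the names `TOY_EMBEDDINGS`, `average_vector`, and `chunk_similarity` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy vectors, assumed to live in one shared
# English-French embedding space (not real trained embeddings).
TOY_EMBEDDINGS = {
    "cat": [0.90, 0.10, 0.00],
    "chat": [0.88, 0.12, 0.02],
    "sleeps": [0.10, 0.90, 0.10],
    "dort": [0.12, 0.85, 0.10],
    "bank": [0.00, 0.10, 0.95],
}


def average_vector(tokens, embeddings):
    """Average the vectors of all known tokens; None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def chunk_similarity(src_tokens, tgt_tokens, embeddings):
    """Similarity of two token lists via their averaged word vectors."""
    u = average_vector(src_tokens, embeddings)
    v = average_vector(tgt_tokens, embeddings)
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


if __name__ == "__main__":
    related = chunk_similarity(["cat", "sleeps"], ["chat", "dort"], TOY_EMBEDDINGS)
    unrelated = chunk_similarity(["cat", "sleeps"], ["bank"], TOY_EMBEDDINGS)
    print(related > unrelated)
```

In practice a detector built this way would flag a pair of chunks when the score exceeds a threshold tuned on held-out data; that threshold is likewise an assumption of this sketch.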

research
04/05/2017

CompiLIG at SemEval-2017 Task 1: Cross-Language Plagiarism Detection Methods for Semantic Textual Similarity

We present our submitted systems for Semantic Textual Similarity (STS) T...
research
05/14/2018

Effects of Word Embeddings on Neural Network-based Pitch Accent Detection

Pitch accent detection often makes use of both acoustic and lexical feat...
research
05/24/2017

Deep Investigation of Cross-Language Plagiarism Detection Methods

This paper is a deep investigation of cross-language plagiarism detectio...
research
11/15/2017

An Unsupervised Approach for Mapping between Vector Spaces

We present a language independent, unsupervised approach for transformin...
research
06/06/2018

The Limitations of Cross-language Word Embeddings Evaluation

The aim of this work is to explore the possible limitations of existing ...
research
05/22/2020

Living Machines: A study of atypical animacy

This paper proposes a new approach to animacy detection, the task of det...
research
05/20/2020

GM-CTSC at SemEval-2020 Task 1: Gaussian Mixtures Cross Temporal Similarity Clustering

This paper describes the system proposed for the SemEval-2020 Task 1: Un...
