DeepAI
Log In Sign Up

Exploring Multilingual Syntactic Sentence Representations

10/25/2019
by   Chen Liu, et al.
0

We study methods for learning sentence embeddings with syntactic structure. We focus on methods of learning syntactic sentence-embeddings by using a multilingual parallel-corpus augmented by Universal Parts-of-Speech tags. We evaluate the quality of the learned embeddings by examining sentence-level nearest neighbours and functional dissimilarity in the embedding space. We also evaluate the ability of the method to learn syntactic sentence-embeddings for low-resource languages and demonstrate strong evidence for transfer learning. Our results show that syntactic sentence-embeddings can be learned while using less training data, fewer model parameters, and resulting in better evaluation metrics than state-of-the-art language models.

READ FULL TEXT

page 1

page 2

page 3

page 4

05/21/2021

Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining

Existing models of multilingual sentence embeddings require large parall...
10/25/2019

Evaluation of Sentence Representations in Polish

Methods for learning sentence representations have been actively develop...
04/06/2022

drsphelps at SemEval-2022 Task 2: Learning idiom representations using BERTRAM

This paper describes our system for SemEval-2022 Task 2 Multilingual Idi...
11/01/2018

A Stronger Baseline for Multilingual Word Embeddings

Levy, Søgaard and Goldberg's (2017) S-ID (sentence ID) method applies wo...
06/20/2019

Hierarchical Document Encoder for Parallel Corpus Mining

We explore using multilingual document embeddings for nearest neighbor m...
04/21/2020

Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation

We present an easy and efficient method to extend existing sentence embe...
02/22/2019

Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax

In this paper, we present an approach to learn multilingual sentence emb...