Clustering Contextualized Representations of Text for Unsupervised Syntax Induction

10/24/2020
by Vikram Gupta, et al.

We explore clustering of contextualized text representations for two unsupervised syntax induction tasks: part-of-speech induction (POSI) and constituency labelling (CoLab). We propose a deep embedded clustering approach that jointly transforms these representations into a lower-dimensional, cluster-friendly space and clusters them. We further enhance these representations by augmenting them with task-specific representations. We also explore the effectiveness of multilingual representations across tasks and languages. With this work, we establish the first strong baselines for unsupervised syntax induction using contextualized text representations. We report competitive performance on 45-tag POSI, state-of-the-art performance on 12-tag POSI across 10 languages, and competitive results on CoLab.
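
The abstract does not spell out the clustering objective, so the following is only a minimal sketch of how a deep embedded clustering step is commonly formulated, not the authors' implementation. It assumes the widely used setup in which low-dimensional points are softly assigned to centroids with a Student's t kernel, a sharpened target distribution is derived from those assignments, and the KL divergence between the two is the quantity to minimize. The encoder that would produce the low-dimensional points is omitted, and the names `soft_assign`, `target_distribution`, `kl_divergence`, and the toy data are hypothetical.

```python
import math

def soft_assign(points, centroids, alpha=1.0):
    """Soft assignment q[i][j] of point i to centroid j (Student's t kernel)."""
    q = []
    for z in points:
        row = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))  # squared distance
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalize over centroids
    return q

def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]^2 / f[j]."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]  # soft cluster sizes
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_divergence(p, q):
    """KL(P || Q), summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )

# Toy stand-ins for already-encoded token representations (assumed data).
points = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(points, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

In a full system of this kind, the loss would be backpropagated through both the centroids and the encoder, which is what makes the transformation and the clustering joint rather than two separate stages.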

research
05/05/2019

HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings

We present our system for semantic frame induction that showed the best ...
research
06/30/2022

Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?

Previous Part-Of-Speech (POS) induction models usually assume certain in...
research
05/12/2018

Unsupervised Semantic Frame Induction using Triclustering

We use dependency triples automatically extracted from a Web-scale corpu...
research
09/03/2019

Duality Regularization for Unsupervised Bilingual Lexicon Induction

Unsupervised bilingual lexicon induction naturally exhibits duality, whi...
research
12/15/2021

Text Mining Through Label Induction Grouping Algorithm Based Method

The main focus of information retrieval methods is to provide accurate a...
research
01/31/2020

Unsupervised Bilingual Lexicon Induction Across Writing Systems

Recent embedding-based methods in unsupervised bilingual lexicon inducti...
research
11/30/2020

A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction

Unsupervised Bilingual Dictionary Induction methods based on the initial...
