String Transduction with Target Language Models and Insertion Handling

09/19/2018
by Garrett Nicolai, et al.

Many character-level tasks can be framed as sequence-to-sequence transduction, where the target is a word from a natural language. We show that leveraging target language models derived from unannotated target corpora, combined with a precise alignment of the training data, yields state-of-the-art results on cognate projection, inflection generation, and phoneme-to-grapheme conversion.
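The core idea in the abstract is to combine a transduction model's scores with a character-level target language model trained on plain, unannotated word lists. The following is a minimal illustrative sketch of that rescoring idea, not the authors' actual model: it trains a hypothetical add-one-smoothed character bigram LM and uses it to rerank candidate output strings alongside a transducer score.

```python
import math
from collections import Counter

def train_char_lm(words, order=2):
    """Train an add-one-smoothed character n-gram LM from an unannotated word list.

    Returns a function mapping a word to its log-probability under the LM.
    """
    counts = Counter()          # n-gram counts
    context_counts = Counter()  # (n-1)-gram counts
    vocab = set()
    for w in words:
        chars = ["<s>"] * (order - 1) + list(w) + ["</s>"]
        vocab.update(chars)
        for i in range(order - 1, len(chars)):
            ctx = tuple(chars[i - order + 1:i])
            counts[ctx + (chars[i],)] += 1
            context_counts[ctx] += 1
    vocab_size = len(vocab)

    def logprob(word):
        chars = ["<s>"] * (order - 1) + list(word) + ["</s>"]
        lp = 0.0
        for i in range(order - 1, len(chars)):
            ctx = tuple(chars[i - order + 1:i])
            # Add-one smoothing so unseen transitions get nonzero probability.
            lp += math.log((counts[ctx + (chars[i],)] + 1)
                           / (context_counts[ctx] + vocab_size))
        return lp

    return logprob

def rerank(candidates, lm_logprob, weight=0.5):
    """Pick the candidate maximizing an interpolation of transducer and LM scores.

    candidates: list of (string, transducer_log_score) pairs.
    """
    return max(candidates,
               key=lambda c: (1 - weight) * c[1] + weight * lm_logprob(c[0]))

# Toy usage: an LM trained on well-formed English past tenses favors
# "walked" over the malformed "walkt", even if the transducer slightly
# prefers the latter.
lm = train_char_lm(["walked", "talked", "jumped", "played", "worked"])
best = rerank([("walkt", -1.0), ("walked", -1.1)], lm)
```

In practice the paper's contribution lies in how the LM is integrated with precise character alignments during training, for which the full text should be consulted; the sketch above only shows the generic "LM as reranker" intuition.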


