Adversarial Transfer Learning for Punctuation Restoration

04/01/2020
by Jiangyan Yi, et al.

Previous studies demonstrate that word embeddings and part-of-speech (POS) tags are helpful for punctuation restoration tasks. However, two drawbacks remain. First, word embeddings are pre-trained with unidirectional language modeling objectives, so they capture only left-to-right context. Second, POS tags are provided by an external POS tagger, which increases computation cost, and incorrectly predicted tags may hurt punctuation restoration during decoding. This paper proposes adversarial transfer learning to address these problems. A pre-trained bidirectional encoder representations from transformers (BERT) model is used to initialize the punctuation model, so the transferred parameters carry both left-to-right and right-to-left representations. Furthermore, adversarial multi-task learning is introduced to learn task-invariant knowledge for punctuation prediction: an auxiliary POS tagging task helps train the punctuation prediction task, and adversarial training prevents the shared parameters from encoding task-specific information. Only the punctuation prediction task is used to restore marks during the decoding stage, so no extra computation is needed and no incorrect tags from the POS tagger are introduced. Experiments are conducted on the IWSLT2011 datasets. The results demonstrate that the punctuation prediction models obtain further performance improvement from the task-invariant knowledge learned with the POS tagging task. Our best model outperforms the previous state-of-the-art model trained only with lexical features by up to 9.2 absolute overall F_1-score on the test set.
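The setup described above can be restated as a shared BERT-initialized encoder with two task heads and an adversarially trained task discriminator. What follows is a minimal sketch under PyTorch and Hugging Face Transformers; the class names (GradientReversal, AdversarialPunctuator), the use of a gradient reversal layer for the adversarial objective, and the checkpoint name are illustrative assumptions, not details confirmed by the abstract.

# Sketch only (not the authors' code): shared BERT encoder, punctuation head,
# auxiliary POS head, and a task discriminator trained through gradient reversal
# so the shared representation stays task-invariant.
import torch
import torch.nn as nn
from transformers import BertModel


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class AdversarialPunctuator(nn.Module):
    def __init__(self, num_punct_labels, num_pos_labels, lambd=1.0):
        super().__init__()
        # Shared encoder initialized from pre-trained BERT (bidirectional context).
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.encoder.config.hidden_size
        self.punct_head = nn.Linear(hidden, num_punct_labels)   # main task
        self.pos_head = nn.Linear(hidden, num_pos_labels)       # auxiliary task
        self.task_discriminator = nn.Linear(hidden, 2)          # which task produced the batch?
        self.lambd = lambd

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        punct_logits = self.punct_head(h)
        pos_logits = self.pos_head(h)
        # Gradient reversal makes the encoder work against the discriminator,
        # discouraging task-specific information in the shared representation.
        adv_logits = self.task_discriminator(GradientReversal.apply(h, self.lambd))
        return punct_logits, pos_logits, adv_logits

In this reading, training would sum the punctuation loss, the POS loss on POS-tagged batches, and the discriminator's cross-entropy against a per-batch task label, while decoding uses only punct_head, which is consistent with the abstract's claim that no extra computation or POS-tagger errors enter the decoding stage.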
