Knowledge Efficient Deep Learning for Natural Language Processing

08/28/2020
by Hai Wang, et al.

Deep learning has become the workhorse for a wide range of natural language processing applications. But much of its success relies on annotated examples, and annotation is time-consuming and expensive to produce at scale. Here we are interested in methods for reducing the required quantity of annotated data by making learning methods more knowledge efficient, so that they become more applicable in low-annotation (low-resource) settings. There are various classical approaches to making models more knowledge efficient, such as multi-task learning, transfer learning, weakly supervised learning, and unsupervised learning. This thesis focuses on adapting such classical methods to modern deep learning models and algorithms, and describes four works aimed at making machine learning models more knowledge efficient. First, we propose knowledge-rich deep learning (KRDL) as a unifying framework for incorporating prior knowledge into deep models; in particular, we apply KRDL built on Markov logic networks to denoise weak supervision. Second, we apply a KRDL model to help machine reading models find the correct evidence sentences that support their decisions. Third, we investigate knowledge transfer techniques in a multilingual setting, where we propose a method that improves pre-trained multilingual BERT using a bilingual dictionary. Fourth, we present an episodic memory network for language modelling, in which we encode large external knowledge for the pre-trained GPT model.
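The abstract gives no implementation details, so as a rough illustration of the bilingual-dictionary idea in the third work, here is a minimal sketch of one common way a lexicon can be used to adapt a multilingual model: generating code-switched training text by swapping words for their dictionary translations. The function name, parameters, and toy English-to-German lexicon are all hypothetical, and this is not necessarily the thesis's exact method.

```python
import random

# Hypothetical bilingual lexicon (English -> German); in practice this
# would come from a real bilingual dictionary resource.
BILINGUAL_DICT = {
    "cat": "Katze",
    "sat": "sass",
    "mat": "Matte",
}

def code_switch(tokens, lexicon, p=0.3, seed=None):
    """Randomly replace tokens with their dictionary translations,
    producing code-switched text for continued pre-training of a
    multilingual encoder such as multilingual BERT."""
    rng = random.Random(seed)
    return [
        lexicon[tok] if tok in lexicon and rng.random() < p else tok
        for tok in tokens
    ]

# Example: mix German translations into an English sentence.
print(code_switch("the cat sat on the mat".split(), BILINGUAL_DICT, p=0.5, seed=0))
```

The code-switched sentences can then be fed back into the model's usual masked-language-modelling objective, encouraging aligned representations across the two languages.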


