Learning Better Sentence Representation with Syntax Information

01/09/2021
by Chen Yang, et al.

Sentence semantic understanding is a key topic in the field of natural language processing. Recently, contextualized word representations derived from pre-trained language models such as ELMo and BERT have shown significant improvements on a wide range of semantic tasks, e.g., question answering, text classification, and sentiment analysis. However, how to add external knowledge to further improve a model's semantic modeling capability is worth probing. In this paper, we propose a novel approach to combining syntax information with a pre-trained language model. First, to evaluate the effect of pre-training, we introduce RNN-based and Transformer-based pre-trained language models; second, to better integrate external knowledge such as syntactic information with the pre-trained model, we propose a dependency syntax expansion (DSE) model. For evaluation, we select two subtasks: sentence completion and biological relation extraction. The experimental results show that our model achieves 91.2% accuracy on the sentence completion task, outperforming the baseline model by 37.8%. It also achieves competitive performance, with a 75.1% F_1 score, on the relation extraction task.
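The abstract does not detail the DSE architecture, so the following is only a minimal sketch of one common way to fuse dependency-syntax information with contextualized representations: embed each token's dependency-relation label and concatenate it with the language model's output before a projection. The class name, module names, and label-inventory size here are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn as nn

class SyntaxFusion(nn.Module):
    """Hypothetical fusion layer: contextual LM states + dependency labels."""
    def __init__(self, hidden_size=768, num_dep_labels=45, dep_dim=64):
        super().__init__()
        # num_dep_labels is an assumed inventory size (e.g., roughly the
        # number of Universal Dependencies relation types)
        self.dep_embed = nn.Embedding(num_dep_labels, dep_dim)
        self.proj = nn.Linear(hidden_size + dep_dim, hidden_size)

    def forward(self, lm_states, dep_labels):
        # lm_states:  (batch, seq_len, hidden_size) from a pre-trained LM
        # dep_labels: (batch, seq_len) integer id of each token's relation
        fused = torch.cat([lm_states, self.dep_embed(dep_labels)], dim=-1)
        return torch.tanh(self.proj(fused))

# Usage with dummy tensors standing in for LM output and a parser's labels:
model = SyntaxFusion()
states = torch.randn(2, 10, 768)
labels = torch.randint(0, 45, (2, 10))
out = model(states, labels)  # (2, 10, 768) syntax-aware token representations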
