Matching Natural Language Sentences with Hierarchical Sentence Factorization

03/01/2018
by Bang Liu, et al.

Semantic matching of natural language sentences, or identifying the relationship between two sentences, is a core research problem underlying many natural language tasks. Depending on whether training data is available, prior research has proposed both unsupervised distance-based schemes and supervised deep learning schemes for sentence matching. However, previous approaches either omit or fail to fully utilize the ordered, hierarchical, and flexible structures of language objects, as well as the interactions between them. In this paper, we propose Hierarchical Sentence Factorization, a technique that factorizes a sentence into a hierarchical representation, with the components at each scale reordered into a "predicate-argument" form. The proposed sentence factorization technique leads to: 1) a new unsupervised distance metric that calculates the semantic distance between a pair of text snippets by solving a penalized optimal transport problem while preserving the logical relationship of words in the reordered sentences, and 2) new multi-scale deep learning models for supervised semantic training, based on factorized sentence hierarchies. We apply our techniques to text-pair similarity estimation and text-pair relationship classification tasks on multiple datasets, such as STSbenchmark, the Microsoft Research paraphrase identification (MSRP) dataset, and the SICK dataset. Extensive experiments show that the proposed hierarchical sentence factorization can significantly improve the performance of existing unsupervised distance-based metrics as well as multiple supervised deep learning models based on convolutional neural networks (CNN) and long short-term memory (LSTM).
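To make the idea of an order-aware optimal transport distance more concrete, the following is a minimal sketch and not the paper's formulation, which the abstract does not spell out. It assumes entropy-regularized optimal transport (Sinkhorn iterations) over toy word vectors, with a hypothetical positional term added to the cost so that matching words at very different relative positions is penalized. The names `sinkhorn_distance`, `ordered_cost`, and `sentence_distance`, the `penalty` and `reg` parameters, and the example vectors are all illustrative assumptions.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def ordered_cost(xs, ys, penalty=0.5):
    """Cost matrix: embedding distance plus an assumed positional penalty.

    The penalty grows with the gap between the relative positions i/n and
    j/m of the two words; it is an illustrative stand-in, not the paper's
    penalty term.
    """
    n, m = len(xs), len(ys)
    return [
        [euclidean(xs[i], ys[j]) + penalty * abs(i / n - j / m) for j in range(m)]
        for i in range(n)
    ]


def sinkhorn_distance(cost, a, b, reg=0.1, n_iter=200):
    """Transport cost of the entropy-regularized plan between histograms a and b."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    return sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))


def sentence_distance(xs, ys, penalty=0.5):
    """Distance between two sentences given as ordered lists of word vectors."""
    a = [1.0 / len(xs)] * len(xs)  # uniform mass on each word
    b = [1.0 / len(ys)] * len(ys)
    return sinkhorn_distance(ordered_cost(xs, ys, penalty), a, b)


if __name__ == "__main__":
    # Toy 2-d "embeddings"; real use would substitute pretrained word vectors.
    sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    shuffled = list(reversed(sentence))
    print(sentence_distance(sentence, sentence))
    print(sentence_distance(sentence, shuffled, penalty=0.0))
    print(sentence_distance(sentence, shuffled, penalty=0.5))
```

With `penalty=0.0` the sketch reduces to an order-insensitive bag-of-words transport cost, so a sentence and its reversal look nearly identical. A positive `penalty` makes the reversed sentence measurably farther away, which is the intuition behind preserving word order after reordering into predicate-argument form.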


research
11/26/2015

A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations

Matching natural language sentences is central for many applications suc...
research
02/13/2017

Bilateral Multi-Perspective Matching for Natural Language Sentences

Natural language sentence matching is a fundamental technology for a var...
research
05/07/2015

Jointly Modeling Embedding and Translation to Bridge Video and Language

Automatically describing video content with natural language is a fundam...
research
03/11/2015

Convolutional Neural Network Architectures for Matching Natural Language Sentences

Semantic matching is of central importance to many natural language task...
research
05/17/2021

Sentence Similarity Based on Contexts

Existing methods to measure sentence similarity are faced with two chall...
research
10/06/2020

Semantically Driven Sentence Fusion: Modeling and Evaluation

Sentence fusion is the task of joining related sentences into coherent t...
research
06/08/2020

CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training

Two important tasks at the intersection of knowledge graphs and natural ...
