A Novel ILP Framework for Summarizing Content with High Lexical Variety

07/25/2018
by Wencan Luo, et al.

Summarizing content contributed by individuals can be challenging because people make different lexical choices even when describing the same events. Nevertheless, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles about the same events published by different news agencies. The high lexical diversity of these documents hinders a system's ability to identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing a summarization framework based on integer linear programming (ILP). It incorporates a low-rank approximation of the sentence-word co-occurrence matrix to intrinsically group semantically similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. Finally, the paper sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety.
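To make the low-rank idea concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the approximation is computed, so this sketch swaps in a plain truncated SVD obtained by power iteration with deflation, and it omits the ILP step entirely. The toy vocabulary, the four sentences, and the function names (`top_singular_triplet`, `low_rank_approx`) are all hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy data: rows are sentences, columns are words.
VOCAB = ["bike", "bicycle", "wheel", "exam", "test", "grade"]
A = [
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # "bike wheel"
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],  # "bicycle wheel"
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],  # "exam grade"
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],  # "test grade"
]

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def top_singular_triplet(M, iters=200, seed=0):
    """Power iteration on M^T M; returns (sigma, u, v) for the largest singular value."""
    Mt = [list(col) for col in zip(*M)]
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    v = [rng.gauss(0.0, 1.0) for _ in range(len(M[0]))]
    for _ in range(iters):
        w = matvec(Mt, matvec(M, v))
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, [0.0] * len(M), v
        v = [x / norm for x in w]
    Mv = matvec(M, v)
    sigma = math.sqrt(sum(x * x for x in Mv))
    u = [x / sigma for x in Mv] if sigma > 0.0 else [0.0] * len(M)
    return sigma, u, v

def low_rank_approx(M, k):
    """Rank-k approximation: extract k singular triplets, deflating after each one."""
    residual = [row[:] for row in M]
    approx = [[0.0] * len(M[0]) for _ in M]
    for _ in range(k):
        sigma, u, v = top_singular_triplet(residual)
        for i in range(len(M)):
            for j in range(len(M[0])):
                term = sigma * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term
    return approx

approx = low_rank_approx(A, k=2)
for sentence_row in approx:
    print([round(x, 2) for x in sentence_row])
```

In this toy case, the sentence "bike wheel" never contains "bicycle", yet its entry for "bicycle" in the rank-2 approximation rises from 0 to about 0.5, because both words co-occur with "wheel". Its entries for the unrelated words "exam", "test", and "grade" stay at about 0. This is the kind of implicit grouping of lexical variants that the abstract describes.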

research
10/15/2021

Modeling Endorsement for Multi-Document Abstractive Summarization

A crucial difference between single- and multi-document summarization is...
research
05/25/2018

An Improved Phrase-based Approach to Annotating and Summarizing Student Course Responses

Teaching large classes remains a great challenge, primarily because it i...
research
10/22/2022

Salience Allocation as Guidance for Abstractive Summarization

Abstractive summarization models typically learn to capture the salient ...
research
10/12/2021

SportsSum2.0: Generating High-Quality Sports News from Live Text Commentary

Sports game summarization aims to generate news articles from live text ...
research
11/28/2019

KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents

Keyphrase generation is the task of predicting a set of lexical units th...
research
04/22/2018

Neural Sentence Location Prediction for Summarization

A competitive baseline in sentence-level extractive summarization of new...
research
04/13/2021

Journals Titles and Mission Statements: Lexical structure, diversity and readability

There is an established research agenda on dissecting an articles compon...
