GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages

12/25/2022
by Lakshmi Sireesha Vakada, et al.

Document summarization aims to create a precise and coherent summary of a text document. Many deep learning summarization models are developed mainly for English and often require a large training corpus as well as efficient pre-trained language models and tools. However, applying English summarization models to low-resource Indian languages is often limited by rich morphological variation and by syntactic and semantic differences. In this paper, we propose GAE-ISumm, an unsupervised Indic summarization model that extracts summaries from text documents. In particular, GAE-ISumm uses a Graph Autoencoder (GAE) to jointly learn text representations and a document summary. We also provide TELSUM, a manually annotated Telugu summarization dataset, to experiment with GAE-ISumm. Further, we experiment with most of the publicly available Indian language summarization datasets to investigate the effectiveness of GAE-ISumm on other Indian languages. Our experiments with GAE-ISumm on seven languages yield the following observations: (i) it is competitive with or better than state-of-the-art results on all datasets, (ii) it reports benchmark results on TELSUM, and (iii) including positional and cluster information in the proposed model improves summarization performance.

research
01/26/2021

Unsupervised Abstractive Summarization of Bengali Text Documents

Abstractive summarization systems generally rely on large collections of...
research
04/03/2019

Jointly Extracting and Compressing Documents with Summary State Representations

We present a new neural model for text summarization that first extracts...
research
10/24/2018

A Multilingual Study of Compressive Cross-Language Text Summarization

Cross-Language Text Summarization (CLTS) generates summaries in a langua...
research
12/06/2021

An unsupervised extractive summarization method based on multi-round computation

Text summarization methods have attracted much attention all the time. I...
research
07/31/2018

An Enhanced Latent Semantic Analysis Approach for Arabic Document Summarization

The fast-growing amount of information on the Internet makes the researc...
research
12/15/2022

Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization

The goal of multimodal abstractive summarization (MAS) is to produce a c...
research
03/30/2022

An Overview of Indian Language Datasets used for Text Summarization

In this paper, we survey Text Summarization (TS) datasets in Indian Lang...
