Automatic Summarization of Russian Texts: Comparison of Extractive and Abstractive Methods

06/18/2022
by Valeriya Goloviznina, et al.

The development of large and super-large language models, such as GPT-3, T5, Switch Transformer, and ERNIE, has significantly improved the performance of text generation. One important research direction in this area is the generation of texts with arguments. A solution to this problem can be used in business meetings, political debates, dialogue systems, and the preparation of student essays. One of the main domains for these applications is the economic sphere. The key problem in argument text generation for the Russian language is the lack of annotated argumentation corpora. In this paper, we use translated versions of the Argumentative Microtext, Persuasive Essays and UKP Sentential corpora to fine-tune a RuBERT model. This model is then used to annotate a corpus of economic news with argumentation labels. The annotated corpus is in turn employed to fine-tune the ruGPT-3 model, which generates argument texts. The results show that this approach improves the accuracy of argument generation by more than 20 percentage points (63.2) compared to the original ruGPT-3 model.
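
The annotate-then-filter step of the pipeline described above can be sketched in plain Python. This is only an illustration under stated assumptions: the label names, the function names `annotate` and `build_generation_corpus`, and the keyword-based `toy_classifier` are all hypothetical stand-ins (the abstract does not specify the annotation scheme), and the English example sentences are invented for the demo. In the actual setup, the classifier would be the fine-tuned RuBERT model and the resulting corpus would be used to fine-tune ruGPT-3.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical label set; the abstract does not state the annotation scheme.
ARGUMENT = "argument"
NON_ARGUMENT = "non-argument"


def annotate(
    sentences: Iterable[str], classify: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Pair every sentence with the label assigned by the classifier."""
    return [(sentence, classify(sentence)) for sentence in sentences]


def build_generation_corpus(
    annotated: Iterable[Tuple[str, str]], keep: str = ARGUMENT
) -> List[str]:
    """Keep only sentences with the chosen label as generator training data."""
    return [sentence for sentence, label in annotated if label == keep]


def toy_classifier(sentence: str) -> str:
    """Stand-in for a fine-tuned classifier: flags simple discourse markers."""
    markers = ("because", "therefore")
    text = sentence.lower()
    return ARGUMENT if any(marker in text for marker in markers) else NON_ARGUMENT


# Invented example sentences, used only to exercise the sketch.
news = [
    "The central bank raised rates because inflation accelerated.",
    "The meeting took place on Tuesday.",
    "Exports fell, therefore the currency weakened.",
]

annotated = annotate(news, toy_classifier)
corpus = build_generation_corpus(annotated)

for sentence in corpus:
    print(sentence)
```

Passing the classifier in as a callable keeps the filtering logic independent of any particular model, so the toy function could be swapped for a real model's prediction function without changing the rest of the sketch.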


research
06/18/2022

Argumentative Text Generation in Economic Domain

The development of large and super-large language models, such as GPT-3,...
research
06/28/2021

Traditional Machine Learning and Deep Learning Models for Argumentation Mining in Russian Texts

Argumentation mining is a field of computational linguistics that is dev...
research
06/17/2020

Automatically Ranked Russian Paraphrase Corpus for Text Generation

The article is focused on automatic development and ranking of a large c...
research
11/03/2020

Semi-Supervised Cleansing of Web Argument Corpora

Debate portals and similar web platforms constitute one of the main text...
research
06/18/2022

RuArg-2022: Argument Mining Evaluation

Argumentation analysis is a field of computational linguistics that stud...
research
05/22/2020

The Discussion Tracker Corpus of Collaborative Argumentation

Although Natural Language Processing (NLP) research on argument mining h...
research
10/04/2021

DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models

In this paper, we present and implement a multi-dimensional, modular fra...
