A Generative Approach to Titling and Clustering Wikipedia Sections

05/22/2020
by Anjalie Field, et al.

We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic encoding and can be used to generate section embeddings. We additionally introduce a new loss function, which further encourages the decoder to generate high-quality embeddings.


