Descartes: Generating Short Descriptions of Wikipedia Articles

05/20/2022
by Marija Sakota, et al.

We introduce and tackle the problem of automatically generating short descriptions of Wikipedia articles (e.g., Belgium has the short description "Country in Western Europe"). We introduce Descartes, a model that generates descriptions on par with those written by human editors. Our human evaluation results indicate that Descartes is preferred over editor-written descriptions about 50% of the time. Further manual analysis shows that 91.3% of the descriptions Descartes generates are considered "valid". This performance is made possible by integrating other signals naturally existing in Wikipedia: (i) articles about the same entity in different languages, (ii) existing short descriptions in other languages, and (iii) structural information from Wikidata. Our work has direct practical applications in helping Wikipedia editors provide short descriptions for the more than 9 million articles still missing one. Finally, our proposed architecture can easily be repurposed to address other information gaps in Wikipedia.

research
06/09/2021

DESCGEN: A Distantly Supervised Dataset for Generating Abstractive Entity Descriptions

Short textual descriptions of entities provide summaries of their key at...
research
01/30/2018

Generating Wikipedia by Summarizing Long Sequences

We show that generating English Wikipedia articles can be approached as ...
research
06/27/2019

BioGen: Automated Biography Generation

A biography of a person is the detailed description of several life even...
research
09/01/2016

How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions

How much is 131 million US dollars? To help readers put such numbers in ...
research
07/26/2023

Measuring Americanization: A Global Quantitative Study of Interest in American Topics on Wikipedia

We conducted a global comparative analysis of the coverage of American t...
research
04/10/2023

WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

In this paper, we introduce a new NLP task – generating short factual ar...
research
12/02/2021

PLSUM: Generating PT-BR Wikipedia by Summarizing Multiple Websites

Wikipedia is an important free source of intelligible knowledge. Despite...
