So What's the Plan? Mining Strategic Planning Documents

07/01/2020
by Ekaterina Artemova, et al.

In this paper we present RuREBus, a corpus of Russian strategic planning documents. The project is grounded in both language technology and e-government perspectives: we develop not only new language sources and tools, but also their applications to e-government research. We demonstrate a pipeline for creating a text corpus from scratch. First, the annotation schema is designed. Next, texts are marked up using a human-in-the-loop strategy, in which preliminary annotations are derived from a machine learning model and then manually corrected. The amount of annotated text is large enough to showcase what insights can be gained from RuREBus.

