Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals

01/25/2023
by Dirk U. Wulff, et al.

Several text-based labeling systems have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of these systems using a variety of text sources. We show that the systems differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to some SDGs than to others), and are affected by the type and amount of text analyzed. We then show that an ensemble model that pools labeling systems alleviates some of these limitations and exceeds the labeling performance of all currently available systems. We conclude that researchers and policymakers should pay attention to the choice of labeling system, and that ensemble methods should be favored when automated methods are used to draw conclusions about the absolute and relative prevalence of work on the SDGs.
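To make the two evaluation metrics and the idea of pooling concrete, the following is a minimal sketch in Python. It is not the authors' ensemble model, whose details are not given here. It assumes a simple vote-count rule as a stand-in, and the three systems, their labels, and the ground truth are invented toy data for a single SDG.

```python
from typing import Dict, List


def sensitivity_specificity(truth: List[bool], predicted: List[bool]) -> Dict[str, float]:
    """Return the true-positive rate (sensitivity) and true-negative rate (specificity)."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fn = sum(t and not p for t, p in zip(truth, predicted))
    tn = sum(not t and not p for t, p in zip(truth, predicted))
    fp = sum(not t and p for t, p in zip(truth, predicted))
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def pool_by_votes(system_labels: List[List[bool]], min_votes: int) -> List[bool]:
    """Label a document positive when at least `min_votes` systems label it positive.

    Assumption: a vote-count rule, used here only as a stand-in for an ensemble.
    """
    return [sum(votes) >= min_votes for votes in zip(*system_labels)]


# Toy data (hypothetical): does each of six documents relate to one given SDG?
truth = [True, True, False, False, True, False]
system_a = [True, True, True, True, True, False]      # over-labels: sensitive, not specific
system_b = [True, False, False, False, False, False]  # under-labels: specific, not sensitive
system_c = [True, True, False, True, True, False]

pooled = pool_by_votes([system_a, system_b, system_c], min_votes=2)

for name, labels in [("A", system_a), ("B", system_b), ("C", system_c), ("pooled", pooled)]:
    print(name, sensitivity_specificity(truth, labels))
```

In this toy case the pooled labels keep the sensitivity of system A while recovering part of the specificity that A lacks. The outcome depends entirely on the invented labels and the vote threshold, so it illustrates the mechanics rather than the reported results.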
