ClimaBench: A Benchmark Dataset For Climate Change Text Understanding in English

01/11/2023
by Tanmay Laud, et al.

The topic of Climate Change (CC) has received limited attention in NLP despite its real-world urgency. Activists and policy-makers need NLP tools to effectively process the vast and rapidly growing body of textual data produced on CC. The utility of such tools, however, depends primarily on whether current state-of-the-art models can generalize across various tasks in the CC domain. To address this gap, we introduce Climate Change Benchmark (ClimaBench), a benchmark collection of existing disparate datasets for systematically evaluating model performance across a diverse set of CC NLU tasks. Further, we enhance the benchmark by releasing two large-scale labelled text classification and question-answering datasets curated from publicly available environmental disclosures. Lastly, we provide an analysis of several generic and CC-oriented models, examining whether fine-tuning on domain text offers any improvement across these tasks. We hope this work provides a standard assessment tool for research on CC text data.


