Leveraging Domain Knowledge for Inclusive and Bias-aware Humanitarian Response Entry Classification

by Nicolò Tamagnone, et al.

Accurate and rapid situation analysis during humanitarian crises is critical to delivering humanitarian aid efficiently and is fundamental to humanitarian imperatives and the Leave No One Behind (LNOB) principle. This data analysis can benefit greatly from language processing systems, e.g., by classifying the text data according to a humanitarian ontology. However, simply fine-tuning a generic large language model (LLM) for this task raises considerable practical and ethical issues, particularly a lack of effectiveness on data-sparse and complex subdomains and the encoding of societal biases and unwanted associations. In this work, we aim to provide an effective and ethically-aware system for humanitarian data analysis. We approach this by (1) introducing a novel architecture adjusted to the humanitarian analysis framework, (2) creating and releasing a novel humanitarian-specific LLM called HumBert, and (3) proposing a systematic way to measure and mitigate biases. Our experiments show that our approach outperforms strong baseline models in both zero-shot and full-training settings, while also revealing the existence of biases in the resulting LLMs. Using a targeted counterfactual data augmentation approach, we significantly reduce these biases without compromising performance.
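To make the idea of counterfactual data augmentation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small, hypothetical list of gendered term pairs (`PAIRS`) and a dataset of `(text, labels)` tuples. Each entry that contains a listed term gets a copy with the terms swapped and the labels unchanged. The term lists and targeting strategy actually used in the paper are not given in this abstract.

```python
import re

# Hypothetical term pairs, chosen for illustration only.
PAIRS = [
    ("men", "women"),
    ("man", "woman"),
    ("boys", "girls"),
    ("boy", "girl"),
    ("father", "mother"),
]

# Bidirectional lookup: each term maps to its counterpart.
SWAP = {}
for a, b in PAIRS:
    SWAP[a] = b
    SWAP[b] = a

# Match whole words only, case-insensitively; longer terms are tried first.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in sorted(SWAP, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def counterfactual(text):
    """Return text with every listed term replaced by its counterpart."""

    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        # Preserve the simple capitalization patterns of the original word.
        if word.isupper():
            return swapped.upper()
        if word[0].isupper():
            return swapped.capitalize()
        return swapped

    return PATTERN.sub(repl, text)


def augment(dataset):
    """Keep every original entry and add a swapped copy where one differs."""
    augmented = []
    for text, labels in dataset:
        augmented.append((text, labels))
        swapped = counterfactual(text)
        if swapped != text:
            augmented.append((swapped, labels))
    return augmented


if __name__ == "__main__":
    sample = [
        ("Women and girls lack access to water.", ["WASH"]),
        ("Roads are blocked.", ["Logistics"]),
    ]
    for text, labels in augment(sample):
        print(labels, text)
```

A plain word-swap like this ignores grammar and context (for example, ambiguous pronouns are deliberately left out of `PAIRS`), so it should be read as an illustration of the general technique only.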


