A Data Fusion Framework for Multi-Domain Morality Learning

04/04/2023
by   Siyi Guo, et al.
7

Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground truth datasets with moral annotations have been released. However, these datasets vary in the method of data collection, domain, topics, instructions for annotators, etc. Simply aggregating such heterogeneous datasets during training can yield models that fail to generalize well. We describe a data fusion framework for training on multiple heterogeneous datasets that improve performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to deal with label shift. We show that the proposed framework achieves state-of-the-art performance in different datasets compared to prior works in morality inference.

READ FULL TEXT

page 5

page 6

research
03/25/2021

THAT: Two Head Adversarial Training for Improving Robustness at Scale

Many variants of adversarial training have been proposed, with most rese...
research
02/15/2022

Label fusion and training methods for reliable representation of inter-rater uncertainty

Medical tasks are prone to inter-rater variability due to multiple facto...
research
10/06/2022

Augmentor or Filter? Reconsider the Role of Pre-trained Language Model in Text Classification Augmentation

Text augmentation is one of the most effective techniques to solve the c...
research
07/30/2022

Learning Feature Decomposition for Domain Adaptive Monocular Depth Estimation

Monocular depth estimation (MDE) has attracted intense study due to its ...
research
07/14/2023

Adversarial Training Over Long-Tailed Distribution

In this paper, we study adversarial training on datasets that obey the l...
research
02/27/2020

Improving Learning Effectiveness For Object Detection and Classification in Cluttered Backgrounds

Usually, Neural Networks models are trained with a large dataset of imag...
research
01/31/2022

Adaptive Sampling Strategies to Construct Equitable Training Datasets

In domains ranging from computer vision to natural language processing, ...

Please sign up or login with your details

Forgot password? Click here to reset