DuTrust: A Sentiment Analysis Dataset for Trustworthiness Evaluation

08/30/2021
by Lijie Wang, et al.

While deep learning models have greatly improved performance on most artificial intelligence tasks, they are often criticized as untrustworthy because of the black-box problem. Consequently, many works have studied the trustworthiness of deep learning. However, because most open datasets are designed to evaluate the accuracy of model outputs, appropriate datasets for evaluating the inner workings of neural networks are still lacking, which hinders the development of trustworthiness research. To systematically evaluate the factors needed for building trustworthy systems, we propose a novel, well-annotated sentiment analysis dataset for evaluating robustness and interpretability. To this end, the dataset contains diverse annotations covering the challenging distribution of instances, manual adversarial instances, and sentiment explanations. We further propose several evaluation metrics for interpretability and robustness. Based on the dataset and metrics, we conduct comprehensive comparisons of the trustworthiness of three typical models and study the relations between accuracy, robustness, and interpretability. We release this trustworthiness evaluation dataset at <https://github/xyz> and hope our work can facilitate progress on building more trustworthy systems for real-world applications.

research
05/23/2022

A Fine-grained Interpretability Evaluation Benchmark for Neural NLP

While there is increasing concern about the interpretability of neural m...
research
03/19/2021

Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond

Deep neural networks have been well-known for their superb performance i...
research
05/31/2021

SA2SL: From Aspect-Based Sentiment Analysis to Social Listening System for Business Intelligence

In this paper, we present a process of building a social listening syste...
research
01/24/2021

A Comprehensive Evaluation Framework for Deep Model Robustness

Deep neural networks (DNNs) have achieved remarkable performance across ...
research
07/23/2019

Interpretable and Steerable Sequence Learning via Prototypes

One of the major challenges in machine learning nowadays is to provide p...
research
01/25/2023

Towards Robust Metrics for Concept Representation Evaluation

Recent work on interpretability has focused on concept-based explanation...
research
11/25/2022

Deep Learning Training Procedure Augmentations

Recent advances in Deep Learning have greatly improved performance on va...
