Inferring Offensiveness In Images From Natural Language Supervision

10/08/2021
by Patrick Schramowski, et al.

Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data. Unfortunately, these approaches also entail severe risks. In particular, large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images, and may also underrepresent specific classes. Consequently, there is an urgent need to carefully document datasets and curate their content. However, this process is tedious and error-prone. We show that pre-trained transformers themselves provide a methodology for the automated curation of large-scale vision datasets. Based on human-annotated examples and the implicit knowledge of a CLIP-based model, we demonstrate that one can select relevant prompts for rating the offensiveness of an image. In addition to, e.g., privacy violations and pornographic content previously identified in ImageNet, we demonstrate that our approach identifies further inappropriate and potentially offensive content.
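
The idea of rating an image against text prompts can be illustrated with a minimal sketch. This is not the authors' implementation: it only assumes that an image and a set of prompts have already been mapped into a shared embedding space (as a CLIP-style model would do), and it rates the image by a softmax over cosine similarities. The function names, the prompt labels, the toy three-dimensional embeddings, and the `temperature` value are all hypothetical and chosen purely for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rate_image(
    image_embedding: Sequence[float],
    prompt_embeddings: Dict[str, Sequence[float]],
    temperature: float = 100.0,  # assumed scaling factor, not from the source
) -> Dict[str, float]:
    """Return a probability per prompt label for one image embedding."""
    labels = list(prompt_embeddings)
    logits = [
        temperature * cosine_similarity(image_embedding, prompt_embeddings[label])
        for label in labels
    ]
    return dict(zip(labels, softmax(logits)))


# Toy stand-ins for real encoder outputs (hypothetical values).
TOY_PROMPTS = {
    "offensive": [1.0, 0.0, 0.0],
    "harmless": [0.0, 1.0, 0.0],
}
toy_image = [0.9, 0.1, 0.0]

ratings = rate_image(toy_image, TOY_PROMPTS)
```

In a real pipeline the toy vectors would be replaced by embeddings from a pre-trained image and text encoder, and the prompts would be chosen using human-annotated examples, as the abstract describes.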


