Few Clean Instances Help Denoising Distant Supervision

09/14/2022
by Yufang Liu, et al.

Existing distantly supervised relation extractors usually rely on noisy data for both model training and evaluation, which may lead to garbage-in-garbage-out systems. To alleviate the problem, we study whether a small clean dataset could help improve the quality of distantly supervised models. We show that, besides enabling a more convincing evaluation of models, a small clean dataset also helps build more robust denoising models. Specifically, we propose a new criterion for clean instance selection based on influence functions. It collects sample-level evidence for recognizing good instances, which is more informative than loss-level evidence. We also propose a teacher-student mechanism for controlling the purity of intermediate results when bootstrapping the clean set. The whole approach is model-agnostic and demonstrates strong performance in denoising both real (NYT) and synthetic noisy datasets.
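
The abstract does not spell out the selection criterion, so the following is only a minimal sketch of the general idea behind influence-based selection, not the authors' method. It assumes a toy two-feature logistic regression with L2 regularization, and it scores each noisy training instance z with the standard influence estimate -g_clean^T H^{-1} g_z, where g_clean is the mean loss gradient on a small clean set and H is the Hessian of the training objective. A negative score means that upweighting z is estimated to lower the loss on the clean set. The data, the function names, and the keep-if-negative rule are all hypothetical choices made for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def grad(w, x, y):
    """Gradient of the logistic loss for one instance (label y in {0, 1})."""
    p = sigmoid(w[0] * x[0] + w[1] * x[1])
    return [(p - y) * x[0], (p - y) * x[1]]

def fit(data, l2=0.1, lr=0.5, steps=2000):
    """Plain gradient descent on mean logistic loss + (l2 / 2) * ||w||^2."""
    w = [0.0, 0.0]
    n = len(data)
    for _ in range(steps):
        g = [l2 * w[0], l2 * w[1]]
        for x, y in data:
            gi = grad(w, x, y)
            g[0] += gi[0] / n
            g[1] += gi[1] / n
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w

def hessian(w, data, l2=0.1):
    """2x2 Hessian of the regularized training objective at w."""
    n = len(data)
    h = [[l2, 0.0], [0.0, l2]]
    for x, _ in data:
        p = sigmoid(w[0] * x[0] + w[1] * x[1])
        s = p * (1.0 - p) / n
        for i in range(2):
            for j in range(2):
                h[i][j] += s * x[i] * x[j]
    return h

def solve2(h, v):
    """Solve h @ u = v for a 2x2 matrix h using the closed-form inverse."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(h[1][1] * v[0] - h[0][1] * v[1]) / det,
            (h[0][0] * v[1] - h[1][0] * v[0]) / det]

def influence_on_clean(w, noisy, clean, l2=0.1):
    """Estimated influence of each noisy instance on the clean-set loss."""
    g_clean = [0.0, 0.0]
    for x, y in clean:
        gi = grad(w, x, y)
        g_clean[0] += gi[0] / len(clean)
        g_clean[1] += gi[1] / len(clean)
    hinv_g = solve2(hessian(w, noisy, l2), g_clean)
    scores = []
    for x, y in noisy:
        gz = grad(w, x, y)
        scores.append(-(hinv_g[0] * gz[0] + hinv_g[1] * gz[1]))
    return scores

# Hypothetical toy data: the label should be 1 when the first feature is positive.
# The last two training instances carry flipped (noisy) labels.
noisy = [((1.0, 0.2), 1), ((0.8, -0.1), 1), ((-1.0, 0.1), 0),
         ((-0.9, -0.3), 0), ((1.1, 0.0), 0), ((-1.2, 0.2), 1)]
clean = [((1.0, 0.0), 1), ((-1.0, 0.0), 0)]

w = fit(noisy)
scores = influence_on_clean(w, noisy, clean)
# Illustrative rule (an assumption): keep instances estimated to help the clean set.
kept = [i for i, s in enumerate(scores) if s < 0]
print(kept)
```

In this toy setting the two flipped instances receive positive scores and are filtered out. A real relation extractor would need an approximate inverse-Hessian-vector product rather than the closed-form 2x2 solve used here.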

research
02/18/2023

One-Pot Multi-Frame Denoising

The performance of learning-based denoising largely depends on clean sup...
research
04/25/2022

Self-supervision versus synthetic datasets: which is the lesser evil in the context of video denoising?

Supervised training has led to state-of-the-art results in image and vid...
research
07/31/2019

Few-Shot Meta-Denoising

We study the problem of learning-based denoising where the training set ...
research
10/26/2020

Meta-Learning for Neural Relation Classification with Distant Supervision

Distant supervision provides a means to create a large number of weakly ...
research
01/02/2023

Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels

A noisy training set usually leads to the degradation of the generalizat...
research
03/27/2020

GPVAD: Towards noise robust voice activity detection via weakly supervised sound event detection

Traditional voice activity detection (VAD) methods work well in clean an...
research
05/01/2023

RViDeformer: Efficient Raw Video Denoising Transformer with a Larger Benchmark Dataset

In recent years, raw video denoising has garnered increased attention du...
