Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis

11/30/2021
by Albert Tseng, et al.

Obtaining annotations for large training sets is expensive, especially in behavior analysis settings where domain knowledge is required for accurate annotations. Weak supervision has been studied as a way to reduce annotation costs by using weak labels from task-level labeling functions to augment ground-truth labels. However, domain experts are still needed to hand-craft labeling functions for every studied task. To reduce expert effort, we present AutoSWAP: a framework for automatically synthesizing data-efficient task-level labeling functions. The key to our approach is to efficiently represent expert knowledge in a reusable domain-specific language and a set of domain-level labeling functions, which we combine with state-of-the-art program synthesis techniques and a small labeled dataset to generate task-level labeling functions. Additionally, we propose a novel structural diversity cost that allows for direct synthesis of diverse sets of labeling functions with minimal overhead, further improving labeling function data efficiency. We evaluate AutoSWAP in three behavior analysis domains and demonstrate that AutoSWAP outperforms existing approaches using only a fraction of the data. Our results suggest that AutoSWAP is an effective way to automatically generate labeling functions that can significantly reduce expert effort for behavior analysis.
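To make the term "labeling function" concrete, the following is a minimal Python sketch of generic weak supervision, not the AutoSWAP method itself. It assumes hand-written labeling functions over per-frame features and combines their votes by simple majority. The feature names (`distance`, `speed`, `relative_angle`), the thresholds, and every function name here are hypothetical and chosen only for illustration; the abstract does not specify the synthesized programs, the domain-specific language, or the structural diversity cost, so none of those are modeled.

```python
from collections import Counter
from typing import Callable, Dict, List

ABSTAIN = -1  # a labeling function may decline to vote on a frame

Frame = Dict[str, float]                # hypothetical per-frame features
LabelingFunction = Callable[[Frame], int]


def lf_close(frame: Frame) -> int:
    """Vote 'behavior present' (1) when the agents are close, else abstain."""
    return 1 if frame["distance"] < 2.0 else ABSTAIN


def lf_fast(frame: Frame) -> int:
    """Vote 1 when the agent moves quickly, otherwise vote 0."""
    return 1 if frame["speed"] > 5.0 else 0


def lf_facing(frame: Frame) -> int:
    """Vote 1 when the agent roughly faces the other agent, otherwise 0."""
    return 1 if abs(frame["relative_angle"]) < 0.5 else 0


LFS: List[LabelingFunction] = [lf_close, lf_fast, lf_facing]


def weak_label(frame: Frame, lfs: List[LabelingFunction]) -> int:
    """Combine labeling-function votes by majority, ignoring abstentions."""
    votes = [v for v in (lf(frame) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN  # no function voted, so the frame stays unlabeled
    return Counter(votes).most_common(1)[0][0]
```

Weak labels produced this way would augment a small ground-truth set during training. In the setting the abstract describes, the hand-written functions above are what the framework aims to replace with automatically synthesized ones.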


11/27/2020

Task Programming: Learning Data Efficient Behavior Representations

Specialized domain knowledge is often necessary to accurately annotate t...
04/28/2022

WeaNF: Weak Supervision with Normalizing Flows

A popular approach to decrease the need for costly manual annotation of ...
04/13/2022

Label Augmentation with Reinforced Labeling for Weak Supervision

Weak supervision (WS) is an alternative to the traditional supervised le...
10/09/2019

Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling

Sequence labeling is a fundamental framework for various natural languag...
06/24/2021

TagRuler: Interactive Tool for Span-Level Data Programming by Demonstration

Despite rapid developments in the field of machine learning research, co...
06/11/2021

Interpreting Expert Annotation Differences in Animal Behavior

Hand-annotated data can vary due to factors such as subjective differenc...
08/30/2022

AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

Weak supervision (WS) is a powerful method to build labeled datasets for...