EUREKA: EUphemism Recognition Enhanced through Knn-based methods and Augmentation

10/23/2022
by Sedrick Scott Keh, et al.

We introduce EUREKA, an ensemble-based approach to automatic euphemism detection. We (1) identify and correct potentially mislabelled rows in the dataset, (2) curate an expanded corpus called EuphAug, (3) leverage model representations of Potentially Euphemistic Terms (PETs), and (4) explore using representations of semantically close sentences to aid classification. Using our augmented dataset and kNN-based methods, EUREKA achieves state-of-the-art results on the public leaderboard of the Euphemism Detection Shared Task, ranking first with a macro F1 score of 0.881. Our code is available at https://github.com/sedrickkeh/EUREKA.
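The abstract does not spell out how the kNN component works, so the following is only a minimal sketch of one common way that representations of semantically close sentences can aid classification: the labels of the nearest training sentences are averaged and blended with a base classifier's probability. The function names, the toy two-dimensional vectors standing in for model representations, the base probability of 0.4, and the interpolation weight are all assumptions made for illustration; they are not taken from the EUREKA code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_probability(query, train_vectors, train_labels, k=3):
    """Fraction of the k most similar training sentences labelled 1 (euphemistic)."""
    if not train_vectors:
        raise ValueError("train_vectors must not be empty")
    # Rank training sentences from most to least similar to the query.
    ranked = sorted(
        range(len(train_vectors)),
        key=lambda i: cosine(query, train_vectors[i]),
        reverse=True,
    )
    top = ranked[:k]
    return sum(train_labels[i] for i in top) / len(top)


def interpolate(model_prob, knn_prob, weight=0.5):
    """Blend the base classifier's probability with the kNN estimate (weight is an assumption)."""
    return weight * knn_prob + (1 - weight) * model_prob


# Toy data: hypothetical sentence representations and binary labels.
train_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = [1, 1, 0, 0]
query = [0.8, 0.2]

knn_prob = knn_probability(query, train_vectors, train_labels, k=3)
final_prob = interpolate(model_prob=0.4, knn_prob=knn_prob, weight=0.5)
prediction = int(final_prob >= 0.5)
print(knn_prob, final_prob, prediction)
```

In this toy case the base classifier alone would predict the non-euphemistic class (0.4 is below 0.5), while two of the three nearest neighbours are labelled euphemistic, so the blended score crosses the threshold. In practice the vectors would come from a trained model and the weight and k would be tuned on held-out data.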


research
04/05/2019

NELEC at SemEval-2019 Task 3: Think Twice Before Going Deep

Existing Machine Learning techniques yield close to human performance on...
research
10/27/2022

Fully-attentive and interpretable: vision and video vision transformers for pain detection

Pain is a serious and costly issue globally, but to be treated, it must ...
research
12/22/2017

Leveraging Text and Knowledge Bases for Triple Scoring: An Ensemble Approach - The BOKCHOY Triple Scorer at WSDM Cup 2017

We present our winning solution for the WSDM Cup 2017 triple scoring tas...
research
10/11/2022

T5 for Hate Speech, Augmented Data and Ensemble

We conduct relatively extensive investigations of automatic hate speech ...
research
08/31/2023

A Sequential Framework for Detection and Classification of Abnormal Teeth in Panoramic X-rays

This paper describes our solution for the Dental Enumeration and Diagnos...
research
06/21/2019

Acute Lymphoblastic Leukemia Classification from Microscopic Images using Convolutional Neural Networks

Examining blood microscopic images for leukemia is necessary when expens...
research
02/25/2018

A Dataset To Evaluate The Representations Learned By Video Prediction Models

We present a parameterized synthetic dataset called Moving Symbols to su...
