Exploiting Unsupervised Pre-training and Automated Feature Engineering for Low-resource Hate Speech Detection in Polish

06/17/2019
by Renard Korzeniowski, et al.

This paper presents our contribution to PolEval 2019 Task 6: Hate speech and bullying detection. We describe three parallel approaches: fine-tuning a pre-trained ULMFiT model for our classification task, fine-tuning a pre-trained BERT model for the same task, and using the TPOT library to search for an optimal pipeline. We present the results achieved with these three tools and review their advantages and disadvantages in terms of user experience. Our team placed second in subtask 2 with a shallow model found by TPOT: a logistic regression classifier with non-trivial feature engineering.

research
03/20/2022

Cluster Tune: Boost Cold Start Performance in Text Classification

In real-world scenarios, a text classification task often begins with a ...
research
12/08/2021

Improving Knowledge Graph Representation Learning by Structure Contextual Pre-training

Representation learning models for Knowledge Graphs (KG) have proven to ...
research
09/09/2022

Trigger Warnings: Bootstrapping a Violence Detector for FanFiction

We present the first dataset and evaluation results on a newly defined c...
research
05/03/2021

Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review

Technology-assisted review (TAR) refers to iterative active learning wor...
research
10/26/2022

Efficient Use of Large Pre-Trained Models for Low Resource ASR

Automatic speech recognition (ASR) has been established as a well-perfor...
research
10/01/2022

Pre-trained Speech Representations as Feature Extractors for Speech Quality Assessment in Online Conferencing Applications

Speech quality in online conferencing applications is typically assessed...
research
09/06/2021

Data Science Kitchen at GermEval 2021: A Fine Selection of Hand-Picked Features, Delivered Fresh from the Oven

This paper presents the contribution of the Data Science Kitchen at Germ...