UTMN at SemEval-2020 Task 11: A Kitchen Solution to Automatic Propaganda Detection

by Elena Mikhalkova, et al.

The article describes a fast solution to propaganda detection at SemEval-2020 Task 11, based on feature adjustment. We use per-token vectorization of features and a simple Logistic Regression classifier to quickly test different hypotheses about our data. We come up with what seems to us the best solution; however, we are unable to align it with the result of the metric suggested by the organizers of the task. We test how our system handles class and feature imbalance by varying the number of samples of two classes (Propaganda and None) in the training set, the size of the context window in which a token is vectorized, and the combination of vectorization means. The result of our system at SemEval-2020 Task 11 is F-score=0.37.
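To make the two knobs named in the abstract concrete (the context window size and the balance between the Propaganda and None classes), here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say which features are used, so the bag-of-words counts, the function names `window_features` and `downsample`, and the `ratio` parameter are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple


def window_features(tokens: List[str], index: int, window: int) -> Dict[str, int]:
    """Vectorize one token as lowercased word counts over its context window.

    Assumption: a plain bag-of-words over tokens[index - window : index + window + 1].
    The abstract does not specify the actual feature set.
    """
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    features: Dict[str, int] = {}
    for tok in tokens[lo:hi]:
        key = tok.lower()
        features[key] = features.get(key, 0) + 1
    return features


def downsample(
    samples: List[Dict[str, int]],
    labels: List[str],
    majority: str = "None",
    ratio: float = 1.0,
    seed: int = 0,
) -> List[Tuple[Dict[str, int], str]]:
    """Keep every non-majority sample and at most ratio * n_minority majority samples.

    Hypothetical helper for varying the class balance of the training set.
    """
    rng = random.Random(seed)
    pairs = list(zip(samples, labels))
    minority = [p for p in pairs if p[1] != majority]
    major = [p for p in pairs if p[1] == majority]
    keep = min(len(major), int(ratio * len(minority)))
    mixed = minority + rng.sample(major, keep)
    rng.shuffle(mixed)
    return mixed


# Toy sentence and toy labels, invented for this example only.
tokens = "We must stop this evil enemy now".split()
labels = ["None", "None", "None", "None", "Propaganda", "Propaganda", "None"]

vectors = [window_features(tokens, i, window=1) for i in range(len(tokens))]
balanced = downsample(vectors, labels, ratio=1.0)
```

The resulting feature dictionaries could then be passed to a vectorizer and a logistic regression classifier, for example scikit-learn's `DictVectorizer` and `LogisticRegression`. Changing `window` and `ratio` corresponds to the kind of variation the abstract describes, under the assumptions stated above.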


