Ensemble Models for Detecting Wikidata Vandalism with Stacking - Team Honeyberry Vandalism Detector at WSDM Cup 2017

12/19/2017
by Tomoya Yamazaki, et al.

The WSDM Cup 2017 vandalism detection task is a binary classification task: each Wikidata revision must be classified as vandalism or non-vandalism. This paper describes our method, which combines several machine learning techniques: under-sampling, feature selection, stacking, and ensembles of models. We confirm the validity of each technique by comparing the AUC-ROC of models trained with and without it. Additionally, we analyze the results and gain useful insights into improving models for the vandalism detection task. Our final submission, made after the deadline, achieved an AUC-ROC of 0.94412.
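To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of random under-sampling of the majority (non-vandalism) class and of computing AUC-ROC from scores. It is not the authors' implementation: the function names `undersample` and `auc_roc`, the `ratio` parameter, and the toy data are assumptions made for illustration, and the abstract does not say which under-sampling scheme the team used.

```python
import random


def undersample(rows, labels, ratio=1.0, seed=0):
    """Keep every positive (vandalism) row and a random subset of negatives.

    `ratio` is the number of negatives kept per positive; it is a
    hypothetical knob, not a setting taken from the paper.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    kept = sorted(pos + rng.sample(neg, n_keep))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def auc_roc(labels, scores):
    """AUC-ROC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive their average rank, so ties count as half a win.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Toy data: 2 vandalism revisions among 12, purely illustrative.
toy_rows = list(range(12))
toy_labels = [1, 1] + [0] * 10
sampled_rows, sampled_labels = undersample(toy_rows, toy_labels, ratio=1.0)
toy_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a comparison of the kind the abstract describes, one would train the same model on the full and on the under-sampled training set and compare `auc_roc` on a held-out set that keeps its original class balance.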


