Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams

06/26/2021
by Mahardhika Pratama, et al.

The large-scale data stream problem refers to high-speed information flow that cannot be processed in a scalable manner on a traditional computing platform. It also imposes a high labelling cost, which makes the deployment of fully supervised algorithms infeasible. Despite this, semi-supervised learning on large-scale data streams is little explored in the literature, because most existing works are designed for traditional single-node computing environments and are fully supervised. This paper proposes the Weakly Supervised Scalable Teacher Forcing Network (WeScatterNet) to cope with the scarcity of labelled samples and with large-scale data streams simultaneously. WeScatterNet is built on the Apache Spark distributed computing platform and uses a data-free model fusion strategy for model compression after the parallel computing stage. It features an open network structure to address global and local drift problems, and integrates a data augmentation, annotation and auto-correction (DA^3) method for handling partially labelled data streams. The performance of WeScatterNet is numerically evaluated on six large-scale data stream problems with a label proportion of only 25%. It shows highly competitive performance even when compared with fully supervised learners that use a 100% label proportion.
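
To make the evaluation setting concrete, the following is a minimal sketch of how a partially labelled stream with a 25% label proportion could be simulated from a fully labelled one. It is an illustration only and is not taken from the paper: the function name `mask_labels`, the per-sample random masking scheme, and the `seed` parameter are all assumptions, and the authors' actual protocol for selecting labelled samples may differ.

```python
import random
from typing import Iterable, Iterator, List, Optional, Tuple


def mask_labels(
    stream: Iterable[Tuple[List[float], int]],
    label_proportion: float = 0.25,
    seed: int = 0,
) -> Iterator[Tuple[List[float], Optional[int]]]:
    """Hypothetical helper: hide labels to mimic a partially labelled stream.

    Each sample keeps its label independently with probability
    `label_proportion`; otherwise the label is replaced by None.
    This masking scheme is an assumption, not the paper's procedure.
    """
    if not 0.0 <= label_proportion <= 1.0:
        raise ValueError("label_proportion must be in [0, 1]")
    rng = random.Random(seed)  # local generator so runs are reproducible
    for features, label in stream:
        keep = rng.random() < label_proportion
        yield features, (label if keep else None)
```

A learner consuming this stream would update in a supervised way only on samples whose label is not `None` and would have to rely on other mechanisms, such as the DA^3 method named in the abstract, for the rest.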

research
05/10/2021

Boosting Semi-Supervised Face Recognition with Noise Robustness

Although deep face recognition benefits significantly from large-scale t...
research
11/03/2019

Weakly Supervised Deep Learning Approach in Streaming Environments

The feasibility of existing data stream algorithms is often hindered by ...
research
05/28/2019

Local Label Propagation for Large-Scale Semi-Supervised Learning

A significant issue in training deep neural networks to solve supervised...
research
07/18/2018

Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS

Many distributed machine learning frameworks have recently been built to...
research
04/15/2021

Personalized Semi-Supervised Federated Learning for Human Activity Recognition

The most effective data-driven methods for human activities recognition ...
research
05/18/2022

SemiCurv: Semi-Supervised Curvilinear Structure Segmentation

Recent work on curvilinear structure segmentation has mostly focused on ...
research
05/14/2014

Active Mining of Parallel Video Streams

The practicality of a video surveillance system is adversely limited by ...
