From Caesar Cipher to Unsupervised Learning: A New Method for Classifier Parameter Estimation

06/06/2019
by Yu Liu, et al.

Many important classification problems, such as object classification, speech recognition, and machine translation, have in the past been tackled with the supervised learning paradigm, which requires training corpora of parallel input-output pairs that are costly to collect. Removing the need for parallel training corpora has practical significance for real-world applications, and it is one of the main goals of unsupervised learning. Recently, encouraging progress has been made in unsupervised learning for such classification problems, and the nature of the challenges has been clarified. In this article, we review this progress and present a class of promising new methods in a form intended to be accessible to machine learning researchers. In particular, we emphasize the key information that enables the success of unsupervised learning: the sequential statistics of the labels, used as a distributional prior. Exploiting such sequential statistics makes it possible to estimate the parameters of classifiers without paired input-output data. We first introduce the Caesar cipher and its decryption, which motivated the construction of the novel loss function for unsupervised learning used throughout the paper. We then use a simple but representative binary classification task as an example to derive and describe the unsupervised learning algorithm step by step. We cover two cases: one in which a bigram language model provides the sequential statistics for unsupervised parameter estimation, and another with a simpler unigram language model. Detailed derivation steps for the learning algorithm are included for both cases. Finally, a summary table compares the computational steps of the two cases when executing the unsupervised learning algorithm to learn binary classifiers.
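
The Caesar cipher motivation can be illustrated with a minimal sketch. This is not the paper's loss function or algorithm; it only shows the underlying idea that a prior over the output symbols (here, a unigram letter distribution) can be enough to recover a mapping without any paired plaintext-ciphertext examples. The function names and the approximate English letter frequencies below are assumptions introduced for illustration.

```python
import math
import string

ALPHABET = string.ascii_lowercase

# Approximate English letter frequencies in percent. These values are an
# illustrative prior chosen for this sketch, not taken from the paper.
ENGLISH_FREQ = {
    "a": 8.2, "b": 1.5, "c": 2.8, "d": 4.3, "e": 12.7, "f": 2.2, "g": 2.0,
    "h": 6.1, "i": 7.0, "j": 0.15, "k": 0.8, "l": 4.0, "m": 2.4, "n": 6.7,
    "o": 7.5, "p": 1.9, "q": 0.10, "r": 6.0, "s": 6.3, "t": 9.1, "u": 2.8,
    "v": 1.0, "w": 2.4, "x": 0.15, "y": 2.0, "z": 0.07,
}
TOTAL = sum(ENGLISH_FREQ.values())


def shift_text(text, shift):
    """Shift every lowercase letter by `shift` positions; keep other characters."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def cross_entropy(text):
    """Average negative log-probability of the letters under the unigram prior."""
    letters = [ch for ch in text if ch in ALPHABET]
    if not letters:
        return 0.0
    return -sum(math.log(ENGLISH_FREQ[ch] / TOTAL) for ch in letters) / len(letters)


def crack_caesar(ciphertext):
    """Return the (shift, plaintext) whose decryption best matches the prior."""
    best_shift = min(
        range(26),
        key=lambda s: cross_entropy(shift_text(ciphertext, -s)),
    )
    return best_shift, shift_text(ciphertext, -best_shift)
```

Calling `crack_caesar(shift_text(message, 3))` on a reasonably long lowercase English `message` scores all 26 candidate shifts and returns the one whose output looks most like English under the prior. Only the distribution of the outputs is used, never a paired example, which is the same kind of information the abstract describes as the distributional prior on the labels.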


research
11/19/2015

Towards Principled Unsupervised Learning

General unsupervised learning is a long-standing conceptual problem in m...
research
12/23/2018

Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching

We consider the problem of training speech recognition systems without u...
research
06/25/2018

An Unsupervised Learning Classifier with Competitive Error Performance

An unsupervised learning classification model is described. It achieves ...
research
06/27/2017

Unsupervised Learning via Total Correlation Explanation

Learning by children and animals occurs effortlessly and largely without...
research
11/19/2015

Patterns for Learning with Side Information

Supervised, semi-supervised, and unsupervised learning estimate a functi...
research
05/28/2022

Speech Augmentation Based Unsupervised Learning for Keyword Spotting

In this paper, we investigated a speech augmentation based unsupervised ...
research
01/14/2020

Unsupervised Learning of the Set of Local Maxima

This paper describes a new form of unsupervised learning, whose input is...
