Fast Nearest-Neighbor Classification using RNN in Domains with Large Number of Classes

by   Gautam Singh, et al.

In scenarios involving text classification where the number of classes is large (in multiples of 10000s) and training samples for each class are few and often verbose, nearest neighbor methods are effective but very slow in computing a similarity score with training samples of every class. On the other hand, machine learning models are fast at runtime but training them adequately is not feasible using few available training samples per class. In this paper, we propose a hybrid approach that cascades 1) a fast but less-accurate recurrent neural network (RNN) model and 2) a slow but more-accurate nearest-neighbor model using bag of syntactic features. Using the cascaded approach, our experiments, performed on data set from IT support services where customer complaint text needs to be classified to return top-N possible error codes, show that the query-time of the slow system is reduced to 1/6^th while its accuracy is being improved. Our approach outperforms an LSH-based baseline for query-time reduction. We also derive a lower bound on the accuracy of the cascaded model in terms of the accuracies of the individual models. In any two-stage approach, choosing the right number of candidates to pass on to the second stage is crucial. We prove a result that aids in choosing this cutoff number for the cascaded system.


page 1

page 2

page 3

page 4


P3DC-Shot: Prior-Driven Discrete Data Calibration for Nearest-Neighbor Few-Shot Classification

Nearest-Neighbor (NN) classification has been proven as a simple and eff...

Choice of neighbor order in nearest-neighbor classification

The kth-nearest neighbor rule is arguably the simplest and most intuitiv...

Distributionally Robust k-Nearest Neighbors for Few-Shot Learning

Learning a robust classifier from a few samples remains a key challenge ...

k-Nearest Neighbor Augmented Neural Networks for Text Classification

In recent years, many deep-learning based models are proposed for text c...

Fast Neighborhood Graph Search using Cartesian Concatenation

In this paper, we propose a new data structure for approximate nearest n...

Maximum Likelihood Directed Enumeration Method in Piecewise-Regular Object Recognition

We explore the problems of classification of composite object (images, s...

Cascaded Classifier for Pareto-Optimal Accuracy-Cost Trade-Off Using off-the-Shelf ANNs

Machine-learning classifiers provide high quality of service in classifi...

Please sign up or login with your details

Forgot password? Click here to reset