Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data

01/09/2020
by Xi Yan, et al.

Transfer learning has proven to be a successful technique for training deep learning models in domains where little training data is available. The dominant approach is to pretrain a model on a large generic dataset such as ImageNet and finetune its weights on the target domain. However, in the new era of an ever-increasing number of massive datasets, selecting the relevant data for pretraining is a critical issue. We introduce Neural Data Server (NDS), a large-scale search engine for finding the most useful transfer learning data for the target domain. NDS consists of a dataserver, which indexes several large popular image datasets, and aims to recommend data to a client: an end-user with a target application and its own small labeled dataset. As in any search engine that serves information to possibly numerous users, we want the online computation performed by the dataserver to be minimal. The dataserver represents large datasets with a much more compact mixture-of-experts model, and employs it to perform data search in a series of dataserver-client transactions at a low computational cost. We show the effectiveness of NDS in various transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets and tasks such as image classification, object detection and instance segmentation. Our Neural Data Server is available as a web service at http://aidemos.cs.toronto.edu/nds/, recommending data to users with the aim of improving the performance of their A.I. application.
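
The dataserver-client exchange described in the abstract can be pictured with a rough sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the indexed data is split into partitions with one expert per partition, that the client scores each expert on its own small labeled set using plain accuracy, and that the server turns those scores into sampling weights with a softmax. The names `score_experts`, `recommend`, `budget`, and `temperature` are hypothetical.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score_experts(experts, client_samples):
    """Client side (assumed): accuracy of each expert on the client's data.

    `experts` is a list of callables mapping an input to a predicted label;
    `client_samples` is a list of (input, label) pairs.
    """
    scores = []
    for expert in experts:
        correct = sum(1 for x, y in client_samples if expert(x) == y)
        scores.append(correct / len(client_samples))
    return scores


def recommend(partitions, expert_scores, budget, temperature=1.0, seed=0):
    """Server side (assumed): pick roughly `budget` items across partitions.

    Each partition receives a share of the budget proportional to the
    softmax weight of its expert; items inside a partition are drawn
    uniformly without replacement.
    """
    if len(partitions) != len(expert_scores):
        raise ValueError("need exactly one score per partition")
    weights = softmax(expert_scores, temperature)
    rng = random.Random(seed)
    picked = []
    for part, weight in zip(partitions, weights):
        count = min(len(part), round(budget * weight))
        picked.extend(rng.sample(part, count))
    return picked
```

In this sketch only the expert scores travel from client to server, so the server's online work reduces to a softmax and a sampling pass, which is consistent with the low online cost the abstract asks for. A lower `temperature` concentrates the recommendation on the partitions whose experts scored best.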


research
06/19/2022

Scalable Neural Data Server: A Data Recommender for Transfer Learning

Absence of large-scale labeled data in the practitioner's target domain ...
research
12/03/2018

A Hybrid Instance-based Transfer Learning Method

In recent years, supervised machine learning models have demonstrated tr...
research
05/02/2018

Exploring the Limits of Weakly Supervised Pretraining

State-of-the-art visual perception models for a wide range of tasks rely...
research
06/16/2018

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

Transferring the knowledge learned from large scale datasets (e.g., Imag...
research
07/09/2020

n-Reference Transfer Learning for Saliency Prediction

Benefiting from deep learning research and large-scale datasets, salienc...
research
06/12/2023

Generating Synthetic Datasets by Interpolating along Generalized Geodesics

Data for pretraining machine learning models often consists of collectio...
research
03/25/2022

The TerraByte Client: providing access to terabytes of plant data

In this paper we demonstrate the TerraByte Client, a software to downloa...
