
- Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
  Pre-trained representations are becoming crucial for many NLP and percep...
- PyGlove: Symbolic Programming for Automated Machine Learning
  Neural networks are sensitive to hyper-parameter and architecture choice...
- Evolving Reinforcement Learning Algorithms
  We propose a method for meta-learning reinforcement learning algorithms ...
- AutoDropout: Learning Dropout Patterns to Regularize Deep Networks
  Neural networks are often over-parameterized and hence benefit from aggr...
- Pre-Training Transformers as Energy-Based Cloze Models
  We introduce Electric, an energy-based cloze model for representation le...
- Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
  Building instance segmentation models that are data-efficient and can ha...
- Towards Domain-Agnostic Contrastive Learning
  Despite recent success, most contrastive self-supervised learning method...
- Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
  We employ a combination of recent developments in semi-supervised learni...
- Smooth Adversarial Training
  It is commonly believed that networks cannot be both accurate and robust...
- Rethinking Pre-training and Self-training
  Pre-training is a dominant paradigm in computer vision. For example, sup...
- AutoHAS: Differentiable Hyper-parameter and Architecture Search
  Neural Architecture Search (NAS) has achieved significant progress in pu...
- Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
  With the success of language pretraining, it is highly desirable to deve...
- Improved Noisy Student Training for Automatic Speech Recognition
  Recently, a semi-supervised learning method known as "noisy student trai...
- Chip Placement with Deep Reinforcement Learning
  In this work, we present a learning-based approach to chip placement, on...
- Evolving Normalization-Activation Layers
  Normalization layers and activation functions are critical components in...
- Improving 3D Object Detection through Progressive Population Based Augmentation
  Data augmentation has been widely adopted for object detection in 3D poi...
- Meta Pseudo Labels
  Many training algorithms of a deep neural network can be interpreted as ...
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
  Masked language modeling (MLM) pre-training methods such as BERT corrupt...
- AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
  Machine learning research has advanced in multiple aspects, including mo...
- Towards a Human-like Open-Domain Chatbot
  We present Meena, a multi-turn open-domain chatbot trained end-to-end on...
- SpecAugment on Large Scale Datasets
  Recently, SpecAugment, an augmentation scheme for automatic speech recog...
- SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization
  Convolutional neural networks typically encode an input image into a ser...
- MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices
  Despite the blooming success of architecture search for vision tasks in ...
- Adversarial Examples Improve Image Recognition
  Adversarial examples are commonly viewed as a threat to ConvNets. Here w...
- EfficientDet: Scalable and Efficient Object Detection
  Model efficiency has become increasingly important in computer vision. I...
- Self-training with Noisy Student improves ImageNet classification
  We present a simple self-training method that achieves 87.4% top-1 accuracy on ImageNet,...
- High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks
  Predicting future video frames is extremely challenging, as there are ma...
- RandAugment: Practical data augmentation with no separate search
  Recent work has shown that data augmentation has the potential to signif...
- Saccader: Improving Accuracy of Hard Attention Models for Vision
  Although deep convolutional neural networks achieve state-of-the-art per...
- MixNet: Mixed Depthwise Convolutional Kernels
  Depthwise convolution is becoming increasingly popular in modern efficie...
- BAM! Born-Again Multi-Task Networks for Natural Language Understanding
  It can be challenging to train multi-task neural networks that outperfor...
- Neural Input Search for Large Scale Recommendation Models
  Recommendation problems with large numbers of discrete items, such as pr...
- Learning Data Augmentation Strategies for Object Detection
  Data augmentation is a critical component of training deep learning mode...
- XLNet: Generalized Autoregressive Pretraining for Language Understanding
  With the capability of modeling bidirectional contexts, denoising autoen...
- Selfie: Self-supervised Pretraining for Image Embedding
  We introduce a pretraining technique called Selfie, which stands for SEL...
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
  Convolutional Neural Networks (ConvNets) are commonly developed at a fix...
- The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study
  We investigate how the final parameters found by stochastic gradient des...
- Searching for MobileNetV3
  We present the next generation of MobileNets based on a combination of c...
- Unsupervised Data Augmentation
  Despite its success, deep learning still needs large labeled datasets to...
- Attention Augmented Convolutional Networks
  Convolutional networks have been the paradigm of choice in many computer...
- SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
  We present SpecAugment, a simple data augmentation method for speech rec...
- NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
  Current state-of-the-art convolutional architectures for object detectio...
- Soft Conditional Computation
  Conditional computation aims to increase the size and accuracy of a netw...
- The Evolved Transformer
  Recent works have highlighted the strengths of the Transformer architect...
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
  Transformer networks have the potential to learn longer-term dependency...
- Domain Adaptive Transfer Learning with Specialist Models
  Transfer learning is a widely used method to build high performing compu...
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
  GPipe is a scalable pipeline parallelism library that enables learning o...
- DropBlock: A regularization method for convolutional networks
  Deep neural networks often work well when they are over-parameterized an...
- Semi-Supervised Sequence Modeling with Cross-View Training
  Unsupervised representation learning algorithms such as word2vec and ELM...
- MnasNet: Platform-Aware Neural Architecture Search for Mobile
  Designing convolutional neural networks (CNN) models for mobile devices ...