
-
Self-Supervised Pretraining of 3D Features on any Point-Cloud
Pretraining on large labeled datasets is a prerequisite to achieve good ...
-
Beyond English-Centric Multilingual Machine Translation
Existing work in translation demonstrated the potential of massively mul...
-
Target Conditioning for One-to-Many Generation
Neural Machine Translation (NMT) models often lack diversity in their ge...
-
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
Unsupervised image representations have significantly reduced the gap wi...
-
Training with Quantization Noise for Extreme Model Compression
We tackle the problem of producing compact models, maximizing their accu...
-
Training with Quantization Noise for Extreme Fixed-Point Compression
We tackle the problem of producing compact models, maximizing their accu...
-
Learning to Visually Navigate in Photorealistic Environments Without any Supervision
Learning to navigate in a realistic setting where an agent must rely sol...
-
Accessing Higher-level Representations in Sequential Transformers with Feedback Memory
Transformers are feedforward networks that can process input tokens in p...
-
Unsupervised pretraining transfers well across languages
Cross-lingual and multi-lingual training of Automatic Speech Recognition...
-
Pruning Convolutional Neural Networks with Self-Supervision
Convolutional neural networks trained without supervision come close to ...
-
Libri-Light: A Benchmark for ASR with Limited or No Supervision
We introduce a new collection of spoken English audio suitable for train...
-
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
We show that margin-based bitext mining in a multilingual sentence space...
-
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Pre-training text representations has led to significant improvements i...
-
Updating Pre-trained Word Vectors and Text Classifiers using Monolingual Alignment
In this paper, we focus on the problem of adapting word vector-based mod...
-
Reducing Transformer Depth on Demand with Structured Dropout
Overparameterized transformer networks have obtained state of the art re...
-
Why Build an Assistant in Minecraft?
In this document we describe a rationale for a research program aimed at...
-
And the Bit Goes Down: Revisiting the Quantization of Neural Networks
In this paper, we address the problem of reducing the memory footprint o...
-
Augmenting Self-attention with Persistent Memory
Transformer networks have led to important progress in language modelin...
-
Adaptive Attention Span in Transformers
We propose a novel self-attention mechanism that can learn its optimal a...
-
Leveraging Large-Scale Uncurated Data for Unsupervised Pre-training of Visual Features
Pre-training general-purpose visual features with convolutional neural n...
-
Cooperative Learning of Disjoint Syntax and Semantics
There has been considerable attention devoted to models that learn to jo...
-
Unsupervised Hyperalignment for Multilingual Word Embeddings
We consider the problem of aligning continuous word representations, lea...
-
Deep Clustering for Unsupervised Learning of Visual Features
Clustering is a class of unsupervised learning methods that has been ext...
-
Unsupervised Alignment of Embeddings with Wasserstein Procrustes
We consider the task of aligning two sets of points in high dimension, w...
-
Improving Supervised Bilingual Mapping of Word Embeddings
Continuous word representations, learned on different languages, can be ...
-
Learning Word Vectors for 157 Languages
Distributed word representations, or word vectors, have recently been ap...
-
Advances in Pre-Training Distributed Word Representations
Many Natural Language Processing applications nowadays rely on pre-train...
-
Unbounded cache model for online language modeling with open vocabulary
Recently, continuous cache models were proposed as extensions to recurre...
-
Fast Linear Model for Knowledge Graph Embeddings
This paper shows that a simple baseline based on a Bag-of-Words (BoW) re...
-
Optimizing the Latent Space of Generative Networks
Generative Adversarial Networks (GANs) have been shown to be able to sam...
-
Unsupervised Learning by Predicting Noise
Convolutional neural networks provide visual features that perform remar...
-
CommAI: Evaluating the first steps towards a useful general AI
With machine learning successfully applied to new daunting problems almo...
-
Learning Visual N-Grams from Web Data
Real-world image recognition systems need to recognize tens of thousands...
-
Improving Neural Language Models with a Continuous Cache
We propose an extension to neural network language models to adapt their...
-
FastText.zip: Compressing text classification models
We consider the problem of producing compact architectures for text clas...
-
Variable Computation in Recurrent Neural Networks
Recurrent neural networks (RNNs) have been used extensively and with inc...
-
Efficient softmax approximation for GPUs
We propose an approximate strategy to efficiently train neural network b...
-
Enriching Word Vectors with Subword Information
Continuous word representations, trained on large unlabeled corpora are ...
-
Bag of Tricks for Efficient Text Classification
This paper explores a simple and efficient baseline for text classificat...
-
Revisiting Visual Question Answering Baselines
Visual question answering (VQA) is an interesting learning setting for e...
-
Locally-Optimized Inter-Subject Alignment of Functional Cortical Regions
Inter-subject registration of cortical areas is necessary in functional ...
-
A Roadmap towards Machine Intelligence
The development of intelligent machines is one of the biggest unsolved c...
-
Learning Simple Algorithms from Examples
We present an approach for learning simple algorithms such as copying, m...
-
Alternative structures for character-level RNNs
Recurrent neural networks are convenient and efficient models for langua...
-
Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
Despite the recent achievements in machine learning, we are still very f...
-
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
One long-term goal of machine learning research is to produce methods th...
-
Learning Longer Memory in Recurrent Neural Networks
The recurrent neural network is a powerful model that learns temporal patter...
-
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
We introduce a model for bidirectional retrieval of images and sentences...
-
A Convex Relaxation for Weakly Supervised Classifiers
This paper introduces a general multi-class approach to weakly supervise...