Samy Bengio

research

∙ 09/21/2023

Boolformer: Symbolic Regression of Logic Functions with Transformers

In this work, we introduce Boolformer, the first Transformer architectur...

0 Stéphane d'Ascoli, et al. ∙

research

∙ 06/12/2023

Transformers learn through gradual rank increase

We identify incremental learning dynamics in transformers, where the dif...

0 Enric Boix-Adserà, et al. ∙

research

∙ 01/30/2023

Generalization on the Unseen, Logic Reasoning and Degree Curriculum

This paper considers the learning of logical (Boolean) functions with fo...

0 Emmanuel Abbe, et al. ∙

research

∙ 11/11/2022

Continuous Soft Pseudo-Labeling in ASR

Continuous pseudo-labeling (PL) algorithms such as slimIPL have recently...

0 Tatiana Likhomanenko, et al. ∙

research

∙ 10/17/2022

Continuous Pseudo-Labeling from the Start

Self-training (ST), or pseudo-labeling has sparked significant interest ...

0 Dan Berrebbi, et al. ∙

research

∙ 05/26/2022

Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures

This paper considers the Pointer Value Retrieval (PVR) benchmark introdu...

90 Emmanuel Abbe, et al. ∙

research

∙ 07/27/2021

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

The successes of deep learning critically rely on the ability of neural ...

16 Chiyuan Zhang, et al. ∙

research

∙ 02/19/2021

Training cascaded networks for speeded decisions using a temporal-difference loss

Although deep feedforward neural networks share some characteristics wit...

8 Michael L. Iuzzolino, et al. ∙

research

∙ 12/14/2020

NeurIPS 2020 Competition: Predicting Generalization in Deep Learning

Understanding generalization in deep learning is arguably one of the mos...

0 Yiding Jiang, et al. ∙

research

∙ 11/05/2020

Data Augmentation via Structured Adversarial Perturbations

Data augmentation is a major component of many machine learning methods ...

0 Calvin Luo, et al. ∙

research

∙ 10/06/2020

Characterising Bias in Compressed Models

The popularity and widespread use of pruning and quantization is driven ...

0 Sara Hooker, et al. ∙

research

∙ 01/14/2020

Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders

It has been of increasing interest in the field to develop automatic mac...

0 Yang Li, et al. ∙

research

∙ 12/04/2019

Fantastic Generalization Measures and Where to Find Them

Generalization of deep networks has been of great interest in recent yea...

23 Yiding Jiang, et al. ∙

research

∙ 09/19/2019

Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

An important research direction in machine learning has centered around ...

5 Aniruddh Raghu, et al. ∙

research

∙ 07/24/2019

Efficient Exploration with Self-Imitation Learning via Trajectory-Conditioned Policy

This paper proposes a method for learning a trajectory-conditioned polic...

1 Yijie Guo, et al. ∙

research

∙ 06/11/2019

Parallel Scheduled Sampling

Auto-regressive models are widely used in sequence generation problems. ...

0 Daniel Duckworth, et al. ∙

research

∙ 06/10/2019

A Closed-Form Learned Pooling for Deep Classification Networks

In modern computer vision tasks, convolutional neural networks (CNNs) ar...

0 Vighnesh Birodkar, et al. ∙

research

∙ 05/20/2019

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Graph convolutional network (GCN) has been successfully applied to many ...

0 Wei-Lin Chiang, et al. ∙

research

∙ 03/04/2019

Do Neural Networks Show Gestalt Phenomena? An Exploration of the Law of Closure

One characteristic of human visual perception is the presence of "Gestal...

8 Been Kim, et al. ∙

research

∙ 02/14/2019

Transfusion: Understanding Transfer Learning with Applications to Medical Imaging

With the increasingly varied applications of deep learning, transfer lea...

26 Maithra Raghu, et al. ∙

research

∙ 02/13/2019

Identity Crisis: Memorization and Generalization under Extreme Overparameterization

We study the interplay between memorization and generalization of overpa...

26 Chiyuan Zhang, et al. ∙

research

∙ 02/06/2019

Are All Layers Created Equal?

Understanding learning and generalization of deep architectures has been...

0 Chiyuan Zhang, et al. ∙

research

∙ 01/29/2019

Semantic Redundancies in Image-Classification Datasets: The 10% You Don't Need

Large datasets have been crucial to the success of deep learning models ...

0 Vighnesh Birodkar, et al. ∙

research

∙ 01/25/2019

Unsupervised speech representation learning using WaveNet autoencoders

We consider the task of unsupervised extraction of meaningful latent rep...

12 Jan Chorowski, et al. ∙

research

∙ 11/27/2018

GaterNet: Dynamic Filter Selection in Convolutional Neural Network via a Dedicated Global Gating Network

The concept of conditional computation for deep nets has been proposed p...

0 Zhourong Chen, et al. ∙

research

∙ 11/03/2018

Content preserving text generation with attribute controls

In this work, we address the problem of modifying textual attributes of ...

0 Lajanugen Logeswaran, et al. ∙

research

∙ 10/23/2018

Area Attention

Existing attention mechanisms are mostly item-based in that a model is ...

4 Yang Li, et al. ∙

research

∙ 09/28/2018

Predicting the Generalization Gap in Deep Networks with Margin Distributions

As shown in recent research, deep neural networks can perfectly fit rand...

0 Yiding Jiang, et al. ∙

research

∙ 06/14/2018

Insights on representational similarity in neural networks with canonical correlation

Comparing different neural network representations and determining how r...

0 Ari S. Morcos, et al. ∙

research

∙ 04/18/2018

A Study on Overfitting in Deep Reinforcement Learning

Recent years have witnessed significant progress in deep Reinforcement...

0 Chiyuan Zhang, et al. ∙

research

∙ 03/31/2018

Adversarial Attacks and Defences Competition

To accelerate research on adversarial examples and robustness of machine...

0 Alexey Kurakin, et al. ∙

research

∙ 03/16/2018

Tensor2Tensor for Neural Machine Translation

Tensor2Tensor is a library for deep learning models that is well-suited ...

0 Ashish Vaswani, et al. ∙

research

∙ 03/15/2018

Large Margin Deep Networks for Classification

We present a formulation of deep learning that aims at producing a large...

0 Gamaleldin F. Elsayed, et al. ∙

research

∙ 03/13/2018

Predicting Human Performance in Vertical Menu Selection Using Deep Learning

Predicting human performance in interaction tasks allows designers or de...

0 Yang Li, et al. ∙

research

∙ 03/09/2018

Fast Decoding in Sequence Models using Discrete Latent Variables

Autoregressive sequence models based on deep neural networks, such as RN...

0 Łukasz Kaiser, et al. ∙

research

∙ 01/29/2018

Discrete Autoencoders for Sequence Models

Recurrent models for sequences have been recently successful at many tas...

0 Łukasz Kaiser, et al. ∙

research

∙ 12/22/2017

On Using Backpropagation for Speech Texture Generation and Voice Conversion

Inspired by recent work on neural network image generation which relies on...

0 Jan Chorowski, et al. ∙

research

∙ 06/13/2017

Device Placement Optimization with Reinforcement Learning

The past few years have witnessed a growth in size and computational req...

0 Azalia Mirhoseini, et al. ∙

research

∙ 03/31/2017

N-gram Language Modeling using Recurrent Neural Network Estimation

We investigate the effective memory depth of RNN models by using them fo...

0 Ciprian Chelba, et al. ∙

research

∙ 03/29/2017

Tacotron: Towards End-to-End Speech Synthesis

A text-to-speech synthesis system typically consists of multiple stages,...

0 Yuxuan Wang, et al. ∙

research

∙ 03/15/2017

Sharp Minima Can Generalize For Deep Nets

Despite their overwhelming capacity to overfit, deep learning architectu...

0 Laurent Dinh, et al. ∙

research

∙ 01/11/2017

Context-aware Captions from Context-agnostic Supervision

We introduce an inference technique to produce discriminative context-aw...

0 Ramakrishna Vedantam, et al. ∙

research

∙ 11/29/2016

Neural Combinatorial Optimization with Reinforcement Learning

This paper presents a framework to tackle combinatorial optimization pro...

0 Irwan Bello, et al. ∙

research

∙ 11/04/2016

Adversarial Machine Learning at Scale

Adversarial examples are malicious inputs designed to fool machine learn...

0 Alexey Kurakin, et al. ∙

research

∙ 10/27/2016

Can Active Memory Replace Attention?

Several mechanisms to focus attention of a neural network on selected pa...

0 Łukasz Kaiser, et al. ∙

research

∙ 09/21/2016

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Automatically describing the content of an image is a fundamental proble...

0 Oriol Vinyals, et al. ∙

research

∙ 07/08/2016

Adversarial examples in the physical world

Most existing machine learning classifiers are highly vulnerable to adve...

0 Alexey Kurakin, et al. ∙

research

∙ 05/27/2016

Density estimation using Real NVP

Unsupervised learning of probabilistic models is a central yet challengi...

0 Laurent Dinh, et al. ∙

research

∙ 04/04/2016

Revisiting Distributed Synchronous SGD

Distributed training of deep learning models on large-scale training dat...

0 Jianmin Chen, et al. ∙

research

∙ 11/19/2015

Order Matters: Sequence to sequence for sets

Sequences have become first class citizens in supervised learning thanks...

0 Oriol Vinyals, et al. ∙
