Devansh Arpit

research

∙ 08/11/2023

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

The massive successes of large language models (LLMs) encourage the emer...

0 Zhiwei Liu, et al. ∙

research

∙ 08/04/2023

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

Recent months have seen the emergence of a powerful new trend in which l...

0 Weiran Yao, et al. ∙

research

∙ 07/18/2023

REX: Rapid Exploration and eXploitation for AI Agents

In this paper, we propose an enhanced approach for Rapid Exploration and...

0 Rithesh Murthy, et al. ∙

research

∙ 03/10/2023

On the Unlikelihood of D-Separation

Causal discovery aims to recover a causal graph from data generated by i...

0 Itai Feigenbaum, et al. ∙

research

∙ 10/21/2021

Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization

In Domain Generalization (DG) settings, models trained on a given set of...

0 Devansh Arpit, et al. ∙

research

∙ 10/19/2021

Momentum Contrastive Autoencoder: Using Contrastive Learning for Latent Space Distribution Matching in WAE

Wasserstein autoencoder (WAE) shows that matching two distributions is e...

0 Devansh Arpit, et al. ∙

research

∙ 10/19/2021

Learning Rich Nearest Neighbor Representations from Self-supervised Ensembles

Pretraining convolutional neural networks via self-supervision, and appl...

0 Bram Wallace, et al. ∙

research

∙ 09/20/2021

Merlion: A Machine Learning Library for Time Series

We introduce Merlion, an open-source machine learning library for time s...

78 Aadyot Bhatnagar, et al. ∙

research

∙ 12/28/2020

Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization

The early phase of training has been shown to be important in two ways f...

16 Stanisław Jastrzębski, et al. ∙

research

∙ 02/21/2020

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

The early phase of training of deep neural networks is critical for thei...

18 Stanisław Jastrzębski, et al. ∙

research

∙ 02/20/2020

Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning

We introduce a parameterization method called Neural Bayes which allows ...

36 Devansh Arpit, et al. ∙

research

∙ 10/01/2019

Entropy Penalty: Towards Generalization Beyond the IID Assumption

It has been shown that instead of learning actual object features, deep ...

11 Devansh Arpit, et al. ∙

research

∙ 06/05/2019

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

Residual networks (ResNet) and weight normalization play an important ro...

15 Devansh Arpit, et al. ∙

research

∙ 01/11/2019

The Benefits of Over-parameterization at Initialization in Deep ReLU Networks

It has been noted in existing literature that over-parameterization in R...

17 Devansh Arpit, et al. ∙

research

∙ 10/06/2018

h-detach: Modifying the LSTM Gradient Towards Better Optimization

Recurrent neural networks are known for their notorious exploding and va...

0 Devansh Arpit, et al. ∙

research

∙ 06/22/2018

On the Spectral Bias of Deep Neural Networks

It is well known that over-parametrized deep neural networks (DNNs) are ...

0 Nasim Rahaman, et al. ∙

research

∙ 02/24/2018

A Walk with SGD

Exploring why stochastic gradient descent (SGD) based optimization metho...

0 Chen Xing, et al. ∙

research

∙ 11/15/2017

Variational Bi-LSTMs

Recurrent neural networks like long short-term memory (LSTM) are importa...

0 Samira Shabanian, et al. ∙

research

∙ 11/13/2017

Three Factors Influencing Minima in SGD

We study the properties of the endpoint of stochastic gradient descent (...

0 Stanisław Jastrzębski, et al. ∙

research

∙ 10/31/2017

Fraternal Dropout

Recurrent neural networks (RNNs) are an important class of architectures am...

0 Konrad Zolna, et al. ∙

research

∙ 10/13/2017

Residual Connections Encourage Iterative Inference

Residual networks (Resnets) have become a prominent architecture in deep...

0 Stanisław Jastrzębski, et al. ∙

research

∙ 06/16/2017

A Closer Look at Memorization in Deep Networks

We examine the role of memorization in deep learning, drawing connection...

0 Devansh Arpit, et al. ∙

research

∙ 05/23/2016

On Optimality Conditions for Auto-Encoder Signal Recovery

Auto-Encoders are unsupervised models that aim to learn patterns from ob...

0 Devansh Arpit, et al. ∙

research

∙ 03/04/2016

Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks

While the authors of Batch Normalization (BN) identify and address an im...

0 Devansh Arpit, et al. ∙

research

∙ 05/21/2015

Why Regularized Auto-Encoders learn Sparse Representation?

0 Devansh Arpit, et al. ∙

research

∙ 12/07/2014

Dimensionality Reduction with Subspace Structure Preservation

Modeling data as being sampled from a union of independent subspaces has...

0 Devansh Arpit, et al. ∙

research

∙ 05/06/2014

Is Joint Training Better for Deep Auto-Encoders?

Traditionally, when generative models of data are developed via deep arc...

0 Yingbo Zhou, et al. ∙

research

∙ 01/17/2014

An Analysis of Random Projections in Cancelable Biometrics

With increasing concerns about security, the need for highly secure phys...

0 Devansh Arpit, et al. ∙
