András György

research

∙ 05/18/2023

Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL

While policy optimization algorithms have played an important role in re...

0 Qinghua Liu, et al. ∙

research

∙ 05/07/2023

Perception, performance, and detectability of conversational artificial intelligence across 32 university courses

The emergence of large language models has led to the development of pow...

0 Hazem Ibrahim, et al. ∙

research

∙ 02/10/2023

A Second-Order Method for Stochastic Bandit Convex Optimisation

We introduce a simple and efficient algorithm for unconstrained zeroth-o...

0 Tor Lattimore, et al. ∙

research

∙ 12/23/2022

Generalization Bounds for Transfer Learning with Pretrained Classifiers

We study the ability of foundation models to learn representations for c...

4 Tomer Galanti, et al. ∙

research

∙ 12/06/2022

Understanding Self-Predictive Learning for Reinforcement Learning

We study the learning dynamics of self-predictive learning for reinforce...

0 Yunhao Tang, et al. ∙

research

∙ 10/27/2022

Confident Approximate Policy Iteration for Efficient Local Planning in q^π-realizable MDPs

We consider approximate dynamic programming in γ-discounted Markov decis...

0 Gellért Weisz, et al. ∙

research

∙ 05/26/2022

Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost

We study distributed contextual linear bandits with stochastic contexts,...

0 Sanae Amani, et al. ∙

research

∙ 02/25/2022

Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms

We study a sequential decision problem where the learner faces a sequenc...

0 MohammadJavad Azizi, et al. ∙

research

∙ 01/17/2022

A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits

We study the non-stationary stochastic multi-armed bandit problem, where...

0 Yasin Abbasi-Yadkori, et al. ∙

research

∙ 12/30/2021

On the Role of Neural Collapse in Transfer Learning

We study the ability of foundation models to learn representations for c...

9 Tomer Galanti, et al. ∙

research

∙ 10/05/2021

TensorPlan and the Few Actions Lower Bound for Planning in MDPs under Linear Realizability of Optimal Value Functions

We consider the minimax query complexity of online planning with a gener...

0 Gellért Weisz, et al. ∙

research

∙ 06/30/2021

Learning to Minimize Age of Information over an Unreliable Channel with Energy Harvesting

The time average expected age of information (AoI) is studied for status...

9 Elif Tuğçe Ceran, et al. ∙

research

∙ 06/15/2021

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning

Many advances that have improved the robustness and efficiency of deep r...

0 Abbas Abdolmaleki, et al. ∙

research

∙ 04/02/2021

Defending Against Image Corruptions Through Adversarial Augmentations

Modern neural networks excel at image classification, yet they remain vu...

0 Dan A. Calian, et al. ∙

research

∙ 02/19/2021

A Reinforcement Learning Approach to Age of Information in Multi-User Networks with HARQ

Scheduling the transmission of time-sensitive information from a source ...

24 Elif Tuğçe Ceran, et al. ∙

research

∙ 02/14/2021

Perceptually Constrained Adversarial Attacks

Motivated by previous observations that the usually applied L_p norms (p...

0 Muhammad Zaid Hameed, et al. ∙

research

∙ 12/01/2020

Mutual Information Constraints for Monte-Carlo Objectives

A common failure mode of density models trained as variational autoencod...

0 Gábor Melis, et al. ∙

research

∙ 10/12/2020

Adapting to Delays and Data in Adversarial Multi-Armed Bandits

We consider the adversarial multi-armed bandit problem under delayed fee...

0 András György, et al. ∙

research

∙ 09/25/2020

Mirror Descent and the Information Ratio

We establish a connection between the stability of mirror descent and th...

0 Tor Lattimore, et al. ∙

research

∙ 06/18/2020

Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting

We consider off-policy evaluation in the contextual bandit setting for t...

0 Ilja Kuzborskij, et al. ∙

research

∙ 06/03/2020

Non-Stationary Bandits with Intermediate Observations

Online recommender systems often face long delays in receiving feedback,...

0 Claire Vernade, et al. ∙

research

∙ 10/12/2019

Minimal Assumptions Refinement for GR(1) Specifications

Reactive synthesis is concerned with finding a correct-by-construction c...

0 Davide G. Cavezza, et al. ∙

research

∙ 05/08/2019

Meta-learning of Sequential Strategies

In this report we review memory-based meta-learning as a tool for buildi...

16 Pedro A. Ortega, et al. ∙

research

∙ 03/06/2019

Detecting Overfitting via Adversarial Examples

The repeated reuse of test sets in popular benchmark problems raises dou...

10 Roman Werpachowski, et al. ∙

research

∙ 02/27/2019

Communication without Interception: Defense against Deep-Learning-based Modulation Detection

We consider a communication scenario, in which an intruder, employing a ...

10 Muhammad Zaid Hameed, et al. ∙

research

∙ 01/24/2019

Reinforcement Learning to Minimize Age of Information with an Energy Harvesting Sensor with HARQ and Sensing Cost

The time average expected age of information (AoI) is studied for status...

0 Elif Tuğçe Ceran, et al. ∙

research

∙ 07/24/2018

Learning from Delayed Outcomes with Intermediate Observations

Optimizing for long term value is desirable in many practical applicatio...

4 Timothy A. Mann, et al. ∙

research

∙ 07/02/2018

LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration

We consider the problem of configuring general-purpose solvers to run ef...

0 Gellért Weisz, et al. ∙

research

∙ 06/11/2018

Adaptive MCMC via Combining Local Samplers

Markov chain Monte Carlo (MCMC) methods are widely used in machine learn...

0 Kiarash Shaloudegi, et al. ∙

research

∙ 06/01/2018

A Reinforcement Learning Approach to Age of Information in Multi-User Networks

Scheduling the transmission of time-sensitive data to multiple users ove...

2 Elif Tuğçe Ceran, et al. ∙

research

∙ 05/08/2018

A Weakness Measure for GR(1) Formulae

In spite of the theoretical and algorithmic developments for system synt...

0 Davide G. Cavezza, et al. ∙

research

∙ 02/08/2018

Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection

Machine learning has become an important component for many systems and ...

0 Andrea Paudice, et al. ∙

research

∙ 12/19/2017

A Reinforcement-Learning Approach to Proactive Caching in Wireless Networks

We consider a mobile user accessing contents in a dynamic environment, w...

0 Samuel O. Somuyiwa, et al. ∙

research

∙ 09/08/2017

A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds

Recently, much work has been done on extending the scope of online learn...

0 Pooria Joulani, et al. ∙

research

∙ 09/22/2016

(Bandit) Convex Optimization with Biased Noisy Gradient Oracles

Algorithms for bandit convex optimization and online learning often rely...

0 Xiaowei Hu, et al. ∙

research

∙ 09/07/2016

Chaining Bounds for Empirical Risk Minimization

This paper extends the standard chaining technique to prove excess risk ...

0 Gábor Balázs, et al. ∙

research

∙ 10/27/2015

Online Learning with Gaussian Payoffs and Side Observations

We consider a sequential learning problem with Gaussian payoffs and side...

0 Yifan Wu, et al. ∙

research

∙ 06/30/2015

Fast Cross-Validation for Incremental Learning

Cross-validation (CV) is one of the main tools for performance estimatio...

0 Pooria Joulani, et al. ∙

research

∙ 05/13/2014

Adaptive Monte Carlo via Bandit Allocation

We consider the problem of sequentially choosing between a set of unbias...

0 James Neufeld, et al. ∙

research

∙ 01/16/2014

Efficient Multi-Start Strategies for Local Search Algorithms

Local search algorithms applied to optimization problems often suffer fr...

0 András György, et al. ∙

research

∙ 06/04/2013

Online Learning under Delayed Feedback

Online learning with delayed feedback has received increasing attention ...

0 Pooria Joulani, et al. ∙

research

∙ 11/03/2012

Partition Tree Weighting

This paper introduces the Partition Tree Weighting technique, an efficie...

0 Joel Veness, et al. ∙

research

∙ 05/01/2012

A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning

We consider the problem of simultaneously learning to linearly combine a...

0 Arash Afkanpour, et al. ∙

