
Online Markov Decision Processes with Aggregate Bandit Feedback
We study a novel variant of online finitehorizon Markov Decision Proces...
read it

Separating Adaptive Streaming from Oblivious Streaming
We present a streaming problem for which every adversariallyrobust stre...
read it

Learning Adversarial Markov Decision Processes with Delayed Feedback
Reinforcement learning typically assumes that the agent observes feedbac...
read it

Adversarial Dueling Bandits
We introduce the problem of regret minimization in Adversarial Dueling B...
read it

Kidney exchange and endless paths: On the optimal use of an altruistic donor
We consider a wellstudied online random graph model for kidney exchange...
read it

The Sparse Vector Technique, Revisited
We revisit one of the most basic and widely applicable techniques in the...
read it

OracleEfficient Reinforcement Learning in Factored MDPs with Unknown Structure
We consider provablyefficient reinforcement learning (RL) in nonepisod...
read it

Beyond Individual and Group Fairness
We present a new datadriven model of fairness that, unlike existing sta...
read it

Detecting malicious PDF using CNN
Malicious PDF files represent one of the biggest threats to computer sec...
read it

A Theory of MultipleSource Adaptation with Limited Target Labeled Data
We study multiplesource domain adaptation, when the learner has access ...
read it

Adversarial Stochastic Shortest Path
Stochastic shortest path (SSP) is a wellknown problem in planning and c...
read it

Reinforcement Learning with Feedback Graphs
We study episodic reinforcement learning in Markov decision processes wh...
read it

Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity
We present a differentially private learner for halfspaces over a finite...
read it

Adversarially Robust Streaming Algorithms via Differential Privacy
A streaming algorithm is said to be adversarially robust if its accuracy...
read it

Three Approaches for Personalization with Applications to Federated Learning
The standard objective in machine learning is to train a single model fo...
read it

Prediction with Corrupted Expert Advice
We revisit the fundamental problem of prediction with expert advice, in ...
read it

Nearoptimal Regret Bounds for Stochastic Shortest Path
Stochastic shortest path (SSP) is a wellknown problem in planning and c...
read it

Privately Learning Thresholds: Closing the Exponential Gap
We study the sample complexity of learning threshold functions under the...
read it

Apprenticeship Learning via FrankWolfe
We consider the applications of the FrankWolfe (FW) algorithm for Appre...
read it

The Role of Apriori Information in Networks of Rational Agents
Until now, distributed algorithms for rational agents have assumed apri...
read it

Individual Regret in Cooperative Nonstochastic MultiArmed Bandits
We study agents communicating over an underlying network by exchanging m...
read it

Online Revenue Maximization for Server Pricing
Efficient and truthful mechanisms to price resources on remote servers/m...
read it

Thompson Sampling for Adversarial Bit Prediction
We study the Thompson sampling algorithm in an adversarial setting, spec...
read it

Graphbased Discriminators: Sample Complexity and Expressiveness
A basic question in learning theory is to identify if two distributions ...
read it

Combinatorial Bandits with FullBandit Feedback: Sample Complexity and Regret Minimization
Combinatorial Bandits generalize multiarmed bandits, where k out of n a...
read it

Repeated A/B Testing
We study a setting in which a learner faces a sequence of A/B tests and ...
read it

Efficient candidate screening under multiple tests and implications for fairness
When recruiting job candidates, employers rarely observe their underlyin...
read it

Average reward reinforcement learning with unknown mixing times
We derive and analyze learning algorithms for policy evaluation, apprent...
read it

Online Convex Optimization in Adversarial Markov Decision Processes
We consider online learning in episodic loopfree Markov decision proces...
read it

Competitive ratio versus regret minimization: achieving the best of both worlds
We consider online algorithms under both the competitive ratio criteria ...
read it

Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies
We consider a settings of hierarchical reinforcement learning, in which ...
read it

Learning LinearQuadratic Regulators Efficiently with only √(T) Regret
We present the first computationallyefficient algorithm with O(√(T)) r...
read it

Differentially Private Learning of Geometric Concepts
We present differentially private efficient algorithms for learning unio...
read it

Learning and Generalization for Matching Problems
We study a classic algorithmic problem through the lens of statistical l...
read it

Optimal Algorithm for Bayesian IncentiveCompatible
We consider a social planner faced with a stream of myopic selfish agent...
read it

Adversarial Online Learning with noise
We present and study models of adversarial online learning where the fee...
read it

Improved generalization bounds for robust learning
We consider a model of robust learning in an adversarial environment. Th...
read it

Online Linear Quadratic Control
We study the problem of controlling linear timeinvariant systems with k...
read it

Are All Experts Equally Good? A Study of Analyst Earnings Estimates
We investigate whether experts possess differential expertise when makin...
read it

Fair Leader Election for Rational Agents in Asynchronous Rings and Networks
We study a game theoretic model where a coalition of processors might co...
read it

Planning and Learning with Stochastic Action Sets
In many practical uses of reinforcement learning (RL) the set of actions...
read it

Flow Equilibria via Online Surge Pricing
We explore issues of dynamic supply and demand in ride sharing services ...
read it

Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies
In this work, we provide theoretical guarantees for reward decomposition...
read it

Are Two (Samples) Really Better Than One? On the NonAsymptotic Performance of Empirical Revenue Maximization
The literature on "mechanism design from samples," which has flourished ...
read it

The Strategy of Experts for Repeated Predictions
We investigate the behavior of experts who seek to make predictions with...
read it

Automatic Representation for Lifetime Value Recommender Systems
Many modern commercial sites employ recommender systems to propose relev...
read it

On the Complexity of Learning with Kernels
A wellrecognized limitation of kernel learning is the requirement to ha...
read it

Nonstochastic MultiArmed Bandits with GraphStructured Feedback
We present and study a partialinformation model of online learning, whe...
read it

From Bandits to Experts: A Tale of Domination and Independence
We consider the partial observability model for multiarmed bandits, int...
read it

An InformationTheoretic Analysis of Hard and Soft Assignment Methods for Clustering
Assignment methods are at the heart of many algorithms for unsupervised ...
read it
Yishay Mansour
verfied profile
Professor of Computer Science at Tel Aviv University