Kenshi Abe

research

∙ 07/13/2023

Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative

Dialog policies, which determine a system's action based on the current ...

Sho Shimoyama, et al.

research

∙ 05/26/2023

A Slingshot Approach to Learning in Monotone Games

In this paper, we address the problem of computing equilibria in monoton...

Kenshi Abe, et al.

research

∙ 05/23/2023

Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games

Learning in games considers how multiple agents maximize their own rewar...

Yuma Fujimoto, et al.

research

∙ 05/02/2023

Exploration of Unranked Items in Safe Online Learning to Re-Rank

Bandit algorithms for online learning to rank (OLTR) problems often aim ...

Hiroaki Shiino, et al.

research

∙ 02/02/2023

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium

Repeated games consider a situation where multiple agents are motivated ...

Yuma Fujimoto, et al.

research

∙ 09/09/2022

Fair Matrix Factorisation for Large-Scale Recommender Systems

Modern recommender systems are hedged with various requirements, such as...

Riku Togashi, et al.

research

∙ 08/21/2022

Last-Iterate Convergence with Full- and Noisy-Information Feedback in Two-Player Zero-Sum Games

The theory of learning in games is prominent in the AI community, motiva...

Kenshi Abe, et al.

research

∙ 06/18/2022

Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games

In this study, we consider a variant of the Follow the Regularized Leade...

Kenshi Abe, et al.

research

∙ 06/02/2022

Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes

Policy gradient (PG) is a reinforcement learning (RL) approach that opti...

Tetsuro Morimura, et al.

research

∙ 02/14/2022

Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search

This paper considers the capacity expansion problem in two-sided matchin...

Kenshi Abe, et al.

research

∙ 10/23/2020

A Practical Guide of Off-Policy Evaluation for Bandit Problems

Off-policy evaluation (OPE) is the problem of estimating the value of a ...

Masahiro Kato, et al.

research

∙ 10/22/2020

Thresholded LASSO Bandit

In this paper, we revisit sparse stochastic contextual linear bandits. I...

Kaito Ariu, et al.

research

∙ 07/04/2020

Off-Policy Exploitability-Evaluation and Equilibrium-Learning in Two-Player Zero-Sum Markov Games

Off-policy evaluation (OPE) is the problem of evaluating new policies us...

Kenshi Abe, et al.

research

∙ 11/18/2019

A Simple Heuristic for Bayesian Optimization with A Low Budget

The aim of black-box optimization is to optimize an objective function w...

Masahiro Nomura, et al.

