Tanner Fiez

is this you? claim profile


  • Data-Driven Spatio-Temporal Analysis of Curbside Parking Demand: A Case-Study in Seattle

    Due to rapid expansion of urban areas in recent years, management of curbside parking has become increasingly important. To mitigate congestion, while meeting a city's diverse needs, performance-based pricing schemes have received a significant amount of attention. However, several recent studies suggest location, time-of-day, and awareness of policies are the primary factors that drive parking decisions. In light of this, we provide an extensive data-driven study of the spatio-temporal characteristics of curbside parking. This work advances the understanding of where and when to set pricing policies, as well as where to target information and incentives to drivers looking to park. Harnessing data provided by the Seattle Department of Transportation, we develop a Gaussian mixture model based technique to identify zones with similar spatial parking demand as quantified by spatial autocorrelation. In support of this technique, we introduce a metric based on the repeatability of our Gaussian mixture model to investigate temporal consistency.

    12/02/2017 ∙ by Tanner Fiez, et al. ∙ 0 share

    read it

  • Incentives in the Dark: Multi-armed Bandits for Evolving Users with Unknown Type

    Design of incentives or recommendations to users is becoming more common as platform providers continually emerge. We propose a multi-armed bandit approach to the problem in which users types are unknown a priori and evolve dynamically in time. Unlike the traditional bandit setting, observed rewards are generated by a single Markov process. We demonstrate via an illustrative example that blindly applying the traditional bandit algorithms results in very poor performance as measured by regret. We introduce two variants of classical bandit algorithms, upper confidence bound (UCB) and epsilon-greedy, for which we provide theoretical bounds on the regret. We conduct a number of simulation-based experiments to show how the algorithms perform in comparison to traditional UCB and epsilon-greedy algorithms as well as reinforcement learning (Q-learning).

    03/11/2018 ∙ by Lillian J. Ratliff, et al. ∙ 0 share

    read it

  • Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

    The design of personalized incentives or recommendations to improve user engagement is gaining prominence as digital platform providers continually emerge. We propose a multi-armed bandit framework for matching incentives to users, whose preferences are unknown a priori and evolving dynamically in time, in a resource constrained environment. We design an algorithm that combines ideas from three distinct domains: (i) a greedy matching paradigm, (ii) the upper confidence bound algorithm (UCB) for bandits, and (iii) mixing times from the theory of Markov chains. For this algorithm, we provide theoretical bounds on the regret and demonstrate its performance via both synthetic and realistic (matching supply and demand in a bike-sharing platform) examples.

    07/06/2018 ∙ by Tanner Fiez, et al. ∙ 0 share

    read it

  • Adaptive Incentive Design

    We apply control theoretic and optimization techniques to adaptively design incentives. In particular, we consider the problem of a planner with an objective that depends on data from strategic decision makers. The planner does not know the process by which the strategic agents make decisions. Under the assumption that the agents are utility maximizers, we model their interactions as a non-cooperative game and utilize the Nash equilibrium concept as well as myopic update rules to model the selection of their decision. By parameterizing the agents' utility functions and the incentives offered, we develop an algorithm that the planner can employ to learn the agents' decision-making processes while simultaneously designing incentives to change their response to a more desirable response from the planner's perspective. We provide convergence results for this algorithm both in the noise-free and noisy cases and present illustrative examples.

    06/14/2018 ∙ by Lillian J. Ratliff, et al. ∙ 0 share

    read it

  • Sequential Experimental Design for Transductive Linear Bandits

    In this paper we introduce the transductive linear bandit problem: given a set of measurement vectors X⊂R^d, a set of items Z⊂R^d, a fixed confidence δ, and an unknown vector θ^∗∈R^d, the goal is to infer argmax_z∈Z z^θ^∗ with probability 1-δ by making as few sequentially chosen noisy measurements of the form x^θ^∗ as possible. When X=Z, this setting generalizes linear bandits, and when X is the standard basis vectors and Z⊂{0,1}^d, combinatorial bandits. Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost. As an example, in drug discovery the compounds and dosages X a practitioner may be willing to evaluate in the lab in vitro due to cost or safety reasons may differ vastly from those compounds and dosages Z that can be safely administered to patients in vivo. Alternatively, in recommender systems for books, the set of books X a user is queried about may be restricted to well known best-sellers even though the goal might be to recommend more esoteric titles Z. In this paper, we provide instance-dependent lower bounds for the transductive setting, an algorithm that matches these up to logarithmic factors, and an evaluation. In particular, we provide the first non-asymptotic algorithm for linear bandits that nearly achieves the information theoretic lower bound.

    06/20/2019 ∙ by Tanner Fiez, et al. ∙ 0 share

    read it

  • Convergence of Learning Dynamics in Stackelberg Games

    This paper investigates the convergence of learning dynamics in Stackelberg games. In the class of games we consider, there is a hierarchical game being played between a leader and a follower with continuous action spaces. We show that in zero-sum games, the only stable attractors of the Stackelberg gradient dynamics are Stackelberg equilibria. This insight allows us to develop a gradient-based update for the leader that converges to Stackelberg equilibria in zero-sum games and the set of stable attractors in general-sum games. We then consider a follower employing a gradient-play update rule instead of a best response strategy and propose a two-timescale algorithm with similar asymptotic convergence results. For this algorithm, we also provide finite-time high probability bounds for local convergence to a neighborhood of a stable Stackelberg equilibrium in general-sum games.

    06/04/2019 ∙ by Tanner Fiez, et al. ∙ 0 share

    read it