On the design of autonomous agents from multiple data sources
This paper addresses the problem of designing agents that dynamically select information from multiple data sources to tackle tasks that involve tracking a target behavior while optimizing a reward. We formulate this problem as a data-driven optimal control problem with integer decision variables and give an explicit expression for its solution. The solution determines how and when the agent should use the data from the sources. We also formalize a notion of regret for the agent and, by relaxing the problem, derive an upper bound on this regret. Simulations complement the results.