Automated design of analog and radio-frequency circuits using supervised...
Large language models (LLMs) are being applied as actors for sequential ...
In temporal-difference reinforcement learning algorithms, variance in va...
Robust reinforcement learning (RL) considers the problem of learning pol...
In competitive two-agent environments, deep reinforcement learning (RL) ...
Generalization to out-of-distribution tasks in reinforcement learning is...
Policy space response oracles (PSRO) is a multi-agent reinforcement lear...
Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in ...
Maximum Entropy Reinforcement Learning (MaxEnt RL) algorithms such as So...
Temporal-Difference (TD) learning methods, such as Q-Learning, have prov...
Multi-agent reinforcement learning has been successfully applied to full...
Natural language instruction following tasks serve as a valuable test-be...
Machine learning algorithms often make decisions on behalf of agents wit...
Policy Space Response Oracles (PSRO) is a deep reinforcement learning al...
A* search is an informed search algorithm that uses a heuristic function...
Finding approximate Nash equilibria in zero-sum imperfect-information ga...
Autonomous agents can learn by imitating teacher demonstrations of the i...
Generalizing manipulation skills to new situations requires extracting i...
Learning from human demonstrations can facilitate automation but is risk...
Learning to accomplish tasks such as driving, grasping or surgery from s...
Reinforcement learning (RL) algorithms involve the deep nesting of disti...
An option is a short-term skill consisting of a control policy for a spe...
It is well known that options can make planning more efficient, among th...
In Passive POMDPs, actions do not affect the world state, but still incur...