The quality of training data impacts the performance of pre-trained larg...
Recent work has shown that language models' (LMs) prompt-based learning ...
Large language models (LLMs) exhibit in-context learning abilities which...
Specifying reward functions for complex tasks like object manipulation o...
For traffic routing platforms, the choice of which route to recommend to...
Inferring reward functions from human behavior is at the center of value...
Large language models (LLMs) transfer well to new tasks out-of-the-box s...
Reward hacking – where RL agents exploit gaps in misspecified reward fun...
The literature on ranking from ordinal data is vast, and there are sever...
Traditional learning approaches for classification implicitly assume tha...
We study the problem of online learning with dynamics, where a learner i...
We study the problem of robustly estimating the posterior distribution f...
We study the problem of robust linear regression with response variable ...
This paper develops the FastRNN and FastGRNN algorithms to address the t...
We study derivative-free methods for policy optimization over the class ...
In this paper, we study the problems of principal Generalized Eigenvecto...
In order to effectively interact with or supervise a robot, humans need ...
We study the problem of robust time series analysis under the standard a...
We study the problem of Robust Least Squares Regression (RLSR) where sev...